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II.  Specific  Objectives 

The  overall  objective  is  to  investigate  the  robust¬ 
ness  to  departures  from  independence  to  methods  currently  in 
use  in  reliability  studies  when  competing  failure  modes  or  com¬ 
peting  causes  of  failure  associated  with  a  single  mode  are 
present  in  a  series  system.  We  shall  also  refer  to  such  com¬ 
petitive  events  as  competing  risks.  The  approach  will  be  through 
the  investigation  of  certain  aspects  of  specific  parametric  multi 
variate  distributions  or  by  classes  of  distributions  which  are 
appropriate  in  reliability  analyses  when  there  are  competing 
risks  present. 

The  specific  objectives  are: 

1)  to  assess  the  error  incurred  in  modeling  system 
life  in  a  series  system  assumed  to  have  indepen¬ 
dent  component  lifetimes  when  in  fact  the  com¬ 
ponent  lifetimes  are  dependent. 

2)  to  assess  the  error  in  estimating  component  param¬ 
eters  (i.e.,  component  reliability,  mean  com¬ 
ponent  life,  etc.)  in  a  series  system  employing 
either  parametric  or  nonparametric  models  which 
assume  independent  component  failure  times  when 

in  fact  the  lifetimes  are  dependent  and  follow 
some  plausible  multivariate  distribution.*  . 

3)  to  derive  bounds  on  component  reliability  when 
the  failure  modes  are  dependent  and  fall  in  a 
particular  dependence  class  (e.g.,  positive  quad¬ 
rant  dependence,  positive  regression  dependence, 
etc. ) . 

4)  to  develop  tests  of  independence,  based  on  data 
collected  from  series  systems,  by  making  some 
restrictive  assumption  about  the  structure  of  the 
systems. ** 

*  A  plausible  parametric  multivariate  distribution  will  be 
one  that  satisfies  one  of  the  following  conditions: 

i)  the  distribution  of  the  minimum  of  the  component 
failure  times  closely  approximates  widely  accept¬ 
ed  families  of  system  life  distributions. 

or  ii)  the  marginal  distributions  closely  approximate 
the  distributions  of  component  failure  times  in 
the  absence  of  other  failure  modes. 

**This  objective  has  been  added  to  the  original  objectives  be¬ 
cause  it  answers  a  natural  question  raised  by  our  preliminary 
investigation. 
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XII.  Introduction  to  Problem  and  Significance  of  Study 

Alvin  Weinberg  (1978)  in  an  editoria.1  comment  in  the 
published  proceedings  of  a  workshop  on  Environmental  Biologi¬ 
cal  Hazards  and  Competing  Risks  noted  that  "the  question  of 
competing  risks  will  not  quiet  lys  go  away:  corrections  for  com- 
peting  risks  should  be  applied  routinely  to  data. "  The  problem 
of  competing  risks  commonly  arises  in  a  wide  range  of  experi¬ 
mental  situations. ^  Although  we  shall  confine  our  attention 
in  the  following  discussion  to  those  situations  involving 
series  systems  in  which  competing  failure  modes  or  competing 
causes  of  failure  associated  with  a  single  mode  are  present# 
it  is  certainly  true  that  we  might  just-  as  easily  speak  of 
clinical  trials#  animal  experiments#  or  other  medical  and  bio¬ 
logical  studies  where  competing  events  interrupt  our  study  of 
the  main  event  of  interest  (cf .  Lagokos  (1979))  . 

Consider  electronic  or  mechanical  systems#  such  as 
satellite  transmission  equipment#  computers#  aircraft#  missiles 
and  other  weaponry  consisting  of  .several  components  in  series. 
Usually  each  component  will  have  a  random  life  length  and  the 
life  of  the  entire  system  will  end  with  the  failure  of 
shortest  lived  component.  We  will  examine  two-  situations  more 
closely  in  which  competing  risks  play  a  vital  role. 

First,  suppose  we  are  attempting  to  evaluate  system  life 
from  knowledge  of  the  individual  component  lifetimes.  Such 
an  evaluation  will  utilize  either  an  analysis  involving  math¬ 
ematical  statistics  or  a  computer  simulation.  ..  At  a  recent 
conference  on  Modeling  and  Simulation#  McLean  (1981)  presented 
a  scheme  to  simulate  the  life  of  a -missile  which  consisted  of 
many  major  components  in  series.  The  failure  distribution  asso¬ 
ciated  with  each  component  was  assumed  to  be  known  (usually 
exponential  or  Weibull'. )  To  arrive  at  the  system  failure  dis¬ 
tribution#  the  components  were  assumed  to  act  independently  of 
each  other.  Realistically#  this  may  or  may  not  be  the  case. 

If  the  component  lifetimes  were  dependent  for  any  reason#  the 
computed  system  failure  distribution  (as  well  as  its  subsequent 
parameters  such  as  system  mean  life  and  system  reliability  for 
a  specified  time)  would  only  crudely  approximate  the  true 
distribution.  The  first  specific  aim  of  this  proposal  is  to 
ascertain  the  error  incurred  in  modeling  system  life  in  a 
series  system  assumed  to  have  independent  component  lifetimes 
(i.e.#  risks)  when#  in  fact#  the  risks  are  dependent. 

Second#  suppose  we  wish  to  evaluate  some  aspect  of  the 
distribution  of  a  particular  failure  mode  based  on  a  typical 
life  test  of  a  series  system.  The  response  of  interest  is  the 
time  until  failure  of  a  particular  mode  of  interest.  Frequently 
this  response  will  hot  be  observable  due  to  the  occurrence  of 
some  other  event  which  precludes  failure  associated  with  the 
mode  of  interest.  We  shall  term  such  competing  events  which 
interrupt  our  study  of  the  main  failure  modes  of  interest  as 
competing  risks. 
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Competing  risks  arise  in  such  reliability  studies  when 

1)  the  study  is  terminated  due  to  a  lack  of  funds  or  the 
pre-determined  period  of  observation  has  expired 
(Type  Z  censoring) . 

2)  the  study  is  terminated  due  to  a  pre-determined  number 
of  failures  of  the  particular  failure  mode  of  interest 
being  observed  (Type  II  censoring) . 

3)  some  systems  fail  because  components  other  them  the 
one  of  interest  malfunction. 

4)  the  component  of  interest  fails  from  some  cause  other 
than  the  one  of  interest. 

In  all  four  situations,  one  may  think  of  the  main  event  of 
interest  as  being  censored,  i.e.,  not  fully  observable.  In  the 
first  two  situations,  the  time  to  occurrence  of  the  event  .of 
interest  should  be  independent  of  the  censoring  mechanism.  In 
such  instances,  the  methodology  for  estimating  relevant  reliabili¬ 
ty  probabilities  has  received  considerable  attention  (cf.  David 
and  Moeschberger  (1978) ,  Kalbfeish  and  Prentice  (1980) ,  Elandt- 
Johnson  and  Johnson  (1980) ,  Mann,  Schafer,  Singpurwalla  (1974) 
and  Barlow  and  Proschan  (1975)  for  references  and  discussion). 

In  the  third  situation,  the  time  to  failure  of  the  component  of 
interest  may  or  may  not  be  independent  of  the  failure  times  of 
'Other  components  in  the  system.  For  example,  there  may  be 
common  environmental  factors  such  as  extreme  temperature  which 
may  affect  the  lifetime  of  several  components.  Thus  the  question 
of  dependent  competing  risks  is  raised.  A  similar  observation 
may  be  made  with  respect  to  the  fourth  situation,  viz.,  failure 
times  associated  with  different  failure  modes  of  a  single  com¬ 
ponent  may  be  dependent.  For  a  very  special  type  of  dependence, 
the  models  discussed  by  Marshall-Olkin  (1967) ,  Langberg,  Proschan 
and  Quinzy  (1978) ,  and  Langberg,  Proschan,  and  Quinzy  (1981) 
allow  one  to  convert  dependent  models  into  independent  ones. 

If  no  assumptions  whatever  are  made  about  the  type  of  * 
dependence  between  the  distribution  of  potential  failure  times, 
there  appears  to  be  little  hope  of  estimating  relevant  component 
parameters .  In  some  situations,  one  may  be  appreciably  misled 
(cf.  Tsiatis  (1975),  Peterson  (1976)).  However,  as  Easterling 
(1980)  so  clearly  points  out  in  his  review  of  Bimbaum's  (1979) 
monograph 

"there  seems  to  be  a  need  for  some  robustness 
studies.  How  far  might  one  be  off,  quantita¬ 
tively,  if  his  analysis  is  based  on  incorrect 
assumptions?" 

The  second  specific  aim  will  address  this  important 
issue.  First  if  a  specific  parametric  model  which  assumes 
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independent  risks  has  been  used  in  the  analysis ,  it  would 
be  of  interest  to  know  how  the  error  in  estimation:  is 
affected  by  this  assumption  of  independence.  That  is,  if 
independent  specific  parametric  distributions  are  assumed 
for  the  failure  times  associated  with  different  failure 
modes  when  we  really  should  use  a  bivariate  (or  multivariate) 
distribution,  then  what  is  the .magnitude  of  the  error  in 
estimating  -component  parameters?  Secondly,  one  may  wish  to 
allow  for  a  less  stringent  type  of  model  assumption,  and  ask 
the  same  question  with  regard  to  the  estimation  error.  That  ‘ 
is,  if  a  nonparametric  analysis  is  performed,  assuming  in¬ 
dependent  risks,  when  some  types  of  dependencies  may  be 
present,  then  what  is  the  magnitude  of  the  estimation  error? 

The  third  specific  aim  will  attempt  to  obtain  bounds  on 
the  component  reliability  when  the  failure  times  belong  to 
a  broad  dependence  class  (e.g.,  association,  positive  quadrant 
dependence,  positive  regression  dependence,  etc.).  More 
details  will  be  presented  in  the  methods  section.' 

In  summary,  competing  risk  analyses  have  been  performed 
in  the  past  and  will  continue  to  be  performed  in  the  future. 
This  study  will  provide  the  user  of  such  techniques  with  a 
clearer  understanding  of  the  robustness  to  departures  from 
independent  risks,  an  assumption  which  most  of  the  methods 
currently  in  use  assume. 
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IV.  Progress  Report  on  Second  Year's  Work 

A  summary  of  the  first  year's  work  is  reported  in  the  annual  report 
dated  October  26,  1983.  The  paper  dealing  with  the  Gumbel  (1960) 
bivariate  exponential  model  appeared  in  the  August  1984  issue  of 
Technometrics .  A  copy  of  this  paper  will  be  included  in  Appendix  A. 

Also,  the  paper  dealing  with  the  asymptotic  bias  of  the  product  limit 
estimator  under  dependent  competing  risks  has  been  accepted  for  publica¬ 
tion  in  the  Journal  of  the  Indian  Association  for  Productivity, 
Reliability,  and  Quality  Control.  This  article  is  to  appear  in  1984. 
(See  Appendix  C  of  the  first  year's  annual  report.)  The  paper  on  tests 
for  independence  with  censored  data  has  been  revised  and  has  now  been 
accepted  for  publication  in  the  Proceedings  of  the  Conference  on 
Reliability  and  Quality  Control  held  at  the  University  of  Missouri  in 
June  1984.  The  paper  was  presented  as  an  invited  paper  at  that  meeting. 
A  copy  of  the  revised  paper  is  included  in  Appendix  B. 

A  paper  which  develops  improved  bounds  on  component  reliability 
based  on  system  data  is  displayed  in  Appendix  C.  These  bounds,  which 
are  tighter  than  those  of  Peterson  (1976),  are  obtained  by  specifying 
a  range  of  possible  concordances  for  the  various  modes  of  failure  in  a 
series  system.  This  work  was  presented  at  the  National  Statistical 
Meetings  in  Philadelphia  in  August  1984. 

Another  paper,  in  a  slightly  different  vein,  which  deals  with 
assessing  the  goodness  of  several  methods  of  estimating  the  survival 
function  (reliability)  when  there  is  extreme  right  censoring,  is 
displayed  in  Appendix  D.  This  work,  which  was  presented  at  the 
Spring  statistical  meeting  in  1984  in  Orlando,  Florida,  has  been 
accepted  for  publication  in  Biometrics. 

Another  paper,  which  deals  with  reduced  bias  estimators  of  the 
joint  reliability  function  for  the  Marshall-Olkin  and  Block-Basu 
bivariate  exponential  distribution,  has  been  tentatively  accepted  for 
publication  in  Sankya .  A  copy  of  this  paper  appears  in  Appendix  E. 

Finally,  work  is  near  completion  with  respect  to  evaluating  the 
consequences  of  erroneously  assuming  independence  when  modeling  system 
reliability  from  complete  component  information  for  all  three  Gumbel 
bivariate  exponential  models,  the  Downton's  bivariate  exponential 
model,  and  Oakes  bivariate  model.  It  is  anticipated  that  a  paper 
will  be  written  in  the  next  month.  This  work  will  be  presented  in  an 
invited  talk  at  the  Spring  statistical  meeting  in  Raleigh,  North 
Carolina.  Another  paper  will  be  written  by  January  1,  1985  which 
examines  the  impact  of  the  independence  assumption  on  the  magnitude 
of  the  estimation  error  in  estimating  component  reliability  and  mean 
life  from  data  collected  from  series  systems. 
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V.  Methods 

We  refer  to  pages  8-52  of  the  original  proposal  for  a  discussion 
of  the  general  methodology. 
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This  article  investigates  the  consequences  of  departures  from  independence  when  the  compo¬ 
nent  lifetimes  in  a  series  system  are  exponentially  distributed.  Such  departures  are  studied 
when  the  joint  distribution  is  assumed  to  follow  a  Gumbel  bivariate  exponential  model.  Two 
distinct  situations  are  considered.  First,  in  theoretical  modeling  of  series  systems,  when  the 
distribution  of  the  component  lifetimes  is  assumed,  one  wishes  to  compute  system  reliability 
and  mean  system  life.  Second,  errors  in  parametric  and  nonparametric  estimation  of  compo¬ 
nent  reliability  and  component  mean  life  are  studied  based  on  life-test  data  collected  on  series 
systems  when  the  assumption  of  independence  is  made  erroneously.  Systems  with  two  com¬ 
ponents  are  studied. 


KEY  WORDS:  Competing  risks;  Component  life;  Modeling  series  systems;  Robustness 
studies;  System  reliability;  Gumbel  bivariate  exponential. 


1.  INTRODUCTION 

Consider  a  system  consisting  of  several  components 
linked  in  series.  For  such  a  system  the  failure  of  any 
one  of  the  components  causes  the  system  to  fail 
Common  assumptions  made  in  modeling  and  ana¬ 
lyzing  data  from  such  a  system  are  that  the  compo¬ 
nent  lifetimes  are  independent  and  exponentially  dis¬ 
tributed.  Many  authors  have  considered  the  problem 
of  analyzing  a  series  system  with  exponential  compo¬ 
nent  lives.  For  example,  confidence  bounds  for  system 
reliability  assuming  independent  exponentially  dis¬ 
tributed  component  lifetimes  were  presented  in  Mann 
(1974)  and  Mann  and  Grubbs  (1974).  (See  Mann, 
Schafer,  and  Singpurwalla  1974  for  a  more  compre¬ 
hensive  review.)  More  recently,  work  invoking  the 
assumption  of  independent  exponentially  distributed 
lifetimes  has  been  presented  by  Chao  (1981)  and  Mi- 
yamura  (1982).  Estimation  of  component  parameters 
from  series  system  data  has  been  treated  by  Board- 
man  and  Kendclt  (1970)  in  the  context  of  independent 
exponential  component  lives.  Some  authors  suggest  a 
nonparametric  alternative  to  the  estimation  of  com¬ 
ponent  reliability  based  on  series  system  data  (com¬ 
pare  Kalbfleisch  and  Prentice  1980  and  Lawless  1982). 

The  assumption  of  independence  is  essential  to 
these  analyses  and  an  important  concern.  Several  au¬ 
thors  have  shown  that  this  assumption,  by  itself,  is  not 
testable  because  based  on  data  from  a  series  system, 
there  is  no  way  to  distinguish  between  an  independent 
and  a  dependent  model.  (See  Tsiatis  1975.  Peterson 


1976.  and  Basu  1981  for  a  discussion  of  nonidentilia- 
bility  results.)  In  many  situations  one  may  be  appre¬ 
ciably  misled  by  the  independence  assumption. 

Lagakos  (1979),  in  a  study  of  the  effects  of  various 
types  of  dependence  among  component  lifetimes, 
notes  that  most  methods  of  analysis  have  assumed 
noninfonnative  models  of  which  independence  is  a 
special  case.  He  points  out,  “it  is  important  to  be 
aware  of  the  possible  consequences  of  making  this 
assumption  when  it  is  false"  (p.  1S2).  Furthermore, 
Easterling  (1980)  states  in  bis  review  of  Birnbaum’s 
(1979)  monograph  on  competing  risks,  “there  seems  to 
be  a  need  for  some  robustness  studies.  How  far  might 
one  be  oft  quantitatively,  if  his  analysis  is  based  on 
incorrect  assumptions?”  (p.  131). 

In  this  article  we  consider  the  consequences  of  de¬ 
partures  from  independence  when  the  component  life¬ 
times  are  exponentially  distributed.  Such  departures 
mas  be  related  to  some  common  environmental  factor 
that  is  present  only  when  the  components  are  linked 
together  in  series.  The  load  each  component  is  subject 
to  is  either  reduced  or  increased  according  to  the  age 
of  the  system.  To  study  such  departures,  we  have 
selected  a  model  proposed  by  Gumbel  (1960). 
Gumbei's  model  retains  the  assumption  of  exponen¬ 
tials  distributed  component  lifetimes  while  allowing 
the  tlexibility  of  both  positive  or  negative  mild  corre¬ 
lation  between  component  lifetimes. 

The  effects  of  a  departure  from  the  assumption  of 
independent  component  lifetimes  in  a  scries  system 
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will  be  addressed  for  two  distinct  situations.  The  first 
situation  arises  in  modeling  the  performance  of  a  the¬ 
oretical  series  system  constructed  from  two  compo¬ 
nents  whose  lifetimes  are  exponentially  distributed. 
Here,  based  on  testing  each  component  separately  or 
on  engineering  design  principles,  it  is  reasonable  to 
assume  that  the  components  are  exponentially  distrib¬ 
uted  with  known  parameter  values.  Based  on  this 
information,  we  wish  to  calculate  parameters  such  as 
the  mean  life  or  reliability  of  a  series  system  construct¬ 
ed  from  these  components.  In  Section  2  we  describe 
how  the  values  of  these  quantities  are  affected  by 
departures  from  independence  when  the  component 
parameters  are  completely  specified.  In  Section  3  we 
study  the  performance  of  the  Mann-Grubbs  (1974) 
confidence  bounds  on  system  reliability  for  small 
sample  sizes  and  for  varying  degrees  of  correlation, 
when  the  component  parameters  are  estimated  from 
component  data. 

The  second  situation  involves  making  inferences 
about  component  lifetime  distributions,  reliabilities, 
and  mean  lives  from  data  collected  on  series  systems. 
Commonly,  data  collected  on  such  systems  are  ana¬ 
lyzed  by  assuming  a  constant-sum  model,  of  which 
independence  is  a  special  case  (compare  Williams  and 
Lagakos  1977  and  Lagakos  and  Williams  1978).  In 
Section  4  we  study  the  properties  of  the  maximum 
likelihood  estimators  of  component  parameters  calcu¬ 
lated  under  an  assumption  of  independent  ex¬ 
ponential  component  lifetimes  when  the  component 
lifetimes  are  Gumbel  bivariate  exponential.  Because  of 
the  widespread  use  of  nonparamctric  estimates  of 
component  reliability,  we  also  present  in  Section  3  the 
estimation  error  of  the  Kaplan-Meier  ( 1 938)  estimator 
when  the  assumption  of  independence  is  made  er¬ 
roneously. 

2.  MODELING  SYSTEM  RELIABILITY  FROM 
COMPLETE  COMPONENT  INFORMATION 
Consider  a  two-component  series  system  with 
component  life  lengths  Y„  X2.  Suppose  that  X,  has 
an  exponential  survival  function 

f,(t)  -  P[X,  >  t)  «  exp  (-2(f), 

A,,  t  >  0,  i  •  1, 2. 

This  assumption  is  made  on  the  basis  of  extensive 
testing  of  each  component  separately  or  on  knowl¬ 
edge  of  the  underlying  mechanism  of  failure.  The 
value  of  A,  is  assumed  known.  If  Y„  Y2  are  indepen¬ 
dent,  then  the  time  to  system  failure  has  an  ex¬ 
ponential  distribution  with  failure  rate  A  =  A,  +  A2, 
and  the  system  reliability  is  given  by 

Fill)  “  P[min  (.Y,.  Y,|  >  1 1 independence] 

«exp(-Ar).  (2.1) 


Suppose  that  the  actual  joint  distribution  of  (Y,. 
Xz) has  the  form  proposed  by  Gumbel  (I960),  namely, 

P(Y,  >  -x,.  Yj  >  x2)  =  [exp  (-A,x,  -  /.j  xj)] 

x  [  1  +  at  1  -  exp  ( -  <tj  x,)X  1  -  exp  ( -  Aj  x2))].  (22) 

The  joint  probability  density  of  ( X , ,  X2)  is 

fix „  Xj)  =  A,A2[exp  (— A,x,  —  AjXj)] 

x  [1  +a(2  exp  (  — A,x,)  -  1) 

x  (2  exp  ( -  A,  x2)  -  1)],  (13) 

where  in  both  (12)  and  (13),  x,,  x2,  A„  A2>0, 
—  1  1.  This  distribution  has  marginal  survival 

functions  equivalent  to  those  for  the  independent 
model,  which,  in  part,  is  the  reason  for  choosing  it 
The  correlation  between  Xx,  X2  is  p  =*  a/4,  and  a  =  0 
is  equivalent  to  Yt,  Y2  being  independent.  For  p  >  0 
(<0)  the  components  are  positively  (negatively) 
quadrant-dependent  (see  Barlow  and  Proschan  1973). 
Furthermore,  the  conditional  expectation  of  Y,,  given 
X  2  =  x2,is 

£(Yt  1  Yj  =  x2)  =  r-  [1  +2p  —  4 p  exp  (-A2  x2fl. 

Ai 

If  (Y„  Y2)  have  the  joint  distribution  (13),  then  the 
true  system  reliability  is 

Ft £t)  =  P[min  (Y„  Y2)  >  1 1  dependence] 

-  exp  (-A/)[l  +  4p(l  -  exp  (-A,r)) 
x  (1  -  exp  (-Ajt))].  (2.4) 

From  (11)  and  (14)  we  see  that  the  error  in  mod¬ 
eling  system  reliability  is 

A(t) »  F^t)  -  F,(t) 

-  4p[l  -  exp  (— A,r)][l  -  exp  (-A2t)] 

xexp(— (At  +A2)t).  (15) 

Note  that  |  A(r)  |  increases  as  |  p  |  increases,  for  fixed  A„ 
A2 ,  and  t.  The  magnitude  of  A(r),  of  course,  depends  on 
At,  A2 ,  t,  and  p.  When  A,  «  A2  -  4>,  one  can  show  that 
A(t)  is  maximized  at  t  —  [In  2]J4>  (fixing  p  and  4>).  The 
value  of  |  Alt) |  at  this  point  is  |p|/4,  which  is  at  most 
1/16.  Representative  values  of  F^t)  for  A,  *I,A2» 
1.5,  and  p  -  -.25,  -.125,0,  .125,  and  25  are  plotted 
in  Figure  1.  The  curve  with  p  «  0  corresponds  to  the 
system  reliability  if  the  assumption  of  independence  is 
true.  Since  most  applications  of  interest  involve  relia¬ 
bilities  of  .75  or  greater,  in  Figure  2  we  plot  the  ratio 
of  the  100  pth  upper  percentiles  under  dependence  and 
independence  versus  the  correlation.  From  Figure  2  it 
appears  that  when  the  predicted  system  reliability 
under  independence  is  greater  than  .90,  misspecifying 
the  dependence  parameter  has  little  effect  In  the  range 
where  the  predicted  system  reliability  under  indepen¬ 
dence  is  less  than  75,  however,  misspecifying  the  de- 
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Figure  1.  System  Reliability  for  Gumbe/'s 
Model.  ml,  Aa*7  5. 

pendence  parameter  may  lead  to  errors  exceeding  6%. 
Maximum  values  of  |  A(r)  |  are  presented  in  Table  1  for 
ii»l  and  various  values  of  A2. 

The  mean  time  to  system  failure  based  on  (21X 
assuming  independence,  is 

H,  «  1/(A,  +  A2X  (26) 

and  that  based  on  (24)  is 

1 

"'“(A.  +  AJ 

+  ^a,  +  a2)  ~  (2/j  + a2)  -  (A,  +  2;J- 


The  amount  of  error  in  modeling  system  mean  life  is 

_ 6  fiMi _ 

~  "  (A,  +  A2X2A,  +  A2XA,  +  2AS) 

6pA|A}  p, 

(2A,  +  A2XA,  +  2A2)’ 


whose  absolute  value  obviously  increases  as  |p|  in¬ 
creases.  If  ■  A2 ,  this  error  reduces  to  2pp,/3,  which 
has  a  maximum  absolute  value  of  p,/6. 

It  is  apparent  from  Table  1  and  Equations  (2.5)  and 
(28)  that  the  error  in  modeling  system  reliability  and 
mean  system  life,  based  on  independence,  increases  as 


e<ra>i4i« 

,  -I-Ct-'-C.  »V3C 


<.H  -O.IS  -O.os  O.OS  0.1S  O.iS 
CORRELATION 

Figure  2  Ratio  of  100  pth  Percentile  Under  De¬ 
pendence  and  Independence  Versus  Correlation  for 
A,  «7,A2«75. 

|  p  |  increases  and  is  a  function  of  the  relative  sizes  of  A, 
and  A2.  In  particular,  when  the  mean  life  of  one  com¬ 
ponent  is  substantially  greater  than  the  mean  life  of 
the  second  component,  then  the  behavior  of  the 
system  is  well  approximated  by  the  behavior  of  the 
shorter-lived  component  acting  alone.  This  can  be 
seen  in  (24)  and  (27)  by  letting  A!  -♦  0.  In  this  instance 
we  also  see,  from  (25)  and  (28),  that  the  amount  of 
error  incurred  by  assuming  independence  is  negligible 

3.  ESTIMATING  SYSTEM  RELIABILITY 
FROM  COMPONENT  DATA 
A  common  practioe  in  predicting  system  reliability 
is  to  test  each  of  the  components  independently  and 
then  to  use  the  data  to  obtain  confidence  bounds  on 

Table  1 .  Maximum  Values  of  |  A(f)  |  for  A,  =  1 
and  Various  Values  of  A2 

l.  Atar|A(r)| 
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system  reliability  These  bounds,  obtained  by  Mann 
and  Grubbs  (1974).  assume  that  the  component  life¬ 
times  are  exponential  and  that  the  components  act 
independently  when  linked  in  series.  In  the  bivariate 
case  the  bounds  are  computed  as  follows:  For  the  jth 
component,  suppose  that  itj  prototypes  have  been 
tested  until  rt  failures  occur.  Let  Zi  be  the  total 
time  on  test  for  the yth  component.  Define 


and 


I  )/z,  + 


1  in  -  \)/z ? 

Y.i’-j-ivzy 


(3.1) 


V 


=  I  (rj  -  1  )tz]  + 


T  ('<  -  l)/2f 

I  (n  -  \yzy 


(3.2) 


An  approximate  y-level  lower  confidence  bound  for 
•ystem  reliability  at  time  r„  is 

exp  [-t.M*{l  -  Vtm*1)  +  »»,(»'*)I/2/(3M‘)}3]. 

(3.3) 

vhere  rj.t  is  the  100/  percentile  of  a  standard  normal 
andom  variable. 

When  the  system  being  evaluated  has  dependent 
:omponents,  these  bounds  may  be  misleading.  The 
iroblem  is  that  component  data  are  independent, 
ince  the  components  are  tested  separately,  but  when 
hey  are  put  together  into  a  system,  some  interdepen¬ 
dence  may  develop.  Of  course,  such  dependence  is  not 
detectable  in  the  absence  of  some  system  data,  since 
the  data  on  components  wc  see  are  independent.  To 
study  the  performance  of  the  bound  (3.3)  when  the 
correct  system  model  is  the  Gumbel  model  (2.2),  a 
simulation  study  was  performed.  For  each  simulated 
sample,  nt  observations  from  exponential  populations 
with  mean  1/Ay,  j  =  1,  2,  were  simulated.  The  two 
samples  were  generated  independently.  The  confi¬ 
dence  bound  (3.3)  was  obtained.  This  was  then  com¬ 
pared  to  the  true  system  reliability  at  various  p’s 
obtained  from  (2.4).  Ten  thousand  such  bounds  were 
simulated  for  each  set  of  parameter  values.  The  esti¬ 
mated  coverage  probabilities  for  the  Mann-Grubbs 
bounds  (i.e„  the  proportion  of  times  that  the  Mann- 
Grubbs  intervals  assuming  independence  actually 
contained  the  true  system  reliability)  for  «,  *  n1  ®  3, 
5,  10,  «  1.0,  »  1,  5,  at  tm  *  .1  are  reported  in 

Table  2.  Here  the  true  system  reliability  under  depen¬ 
dence  ranges  from  .7684  at  p  -  -.25  to  .7891  at 
p  =  .25,  with  a  value  of  .7788  at  p  =  .0. 

The  results  in  Table  2  show  that  at  high  negative 
correlations,  the  coverage  probabilities  are  signifi¬ 
cantly  lower  than  claimed  under  independence,  and 
for  a  high  positive  correlation,  the  intervals  are  con¬ 
servative.  This  trend  becomes  more  exaggerated  as  n,. 
n,  increase  because  the  bound  approaches  the  reliabil¬ 
ity  under  independence.  As  seen  in  Section  2.  the  true 
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Table  2  Estimated  Coverage  Probabilities 
for  Mann  -  Grubbs  Bounds 
_  =  10.;.!  =  is) 

Correction 


n. 

7 

-  25 

-  15 

-OS 

0 

OS 

IS 

2S 

3 

3 

.95 

93  41  * 

94  11* 

94  74 

95  05 

9527 

95  80* 

9622' 

3 

3 

.90 

S' 4 0* 

8842* 

89  32* 

89  78 

90.20 

91.18* 

92  15' 

3 

3 

.75 

71  03* 

72  53* 

74  13* 

74.88 

76  58 

77.34* 

78  81' 

S 

5 

95 

9319* 

94  04* 

94  90 

95  26 

9562* 

96.17* 

96  81' 

S 

5 

.90 

87  12* 

88.48* 

89.85 

90.39 

91.10 

9213* 

93.14- 

s 

5 

.75 

69  68* 

72.02* 

74.10* 

7513 

76. 14* 

78.32* 

80.C3' 

10 

10 

35 

92  03* 

9342* 

94  58 

95  08 

9551* 

96.42* 

97  14' 

10 

10 

.90 

8593* 

87  70* 

89  34* 

90.21 

91  05* 

92  56* 

9393' 

10 

10 

.75 

67  77* 

70.90* 

74.12* 

75  63 

77  05* 

79  87* 

8256' 

*  At  teas*  two  standard  fton  «bovt  specified  level. 

*  At  toast  two  standard  errors  below  specified  level. 

NOTE :  Standa'd  errors  of  the  above  estimates  are  approximately  .2  for  the 
.95  level.  .3  for  me  90  level,  and  .4  for  the  .75  level. 


reliability  at  r  is  an  increasing  function  of  p  so  that 
asymptotically  coverage  probabilities  approach  0  (or 
I)  for  p  <  0  ( >  0).  For  sample  sizes  in  the  range  of  3  to 
10,  the  estimated  coverage  probabilities  for  p  <  0  are 
statistically  significantly  lower  than  expected.  On  the 
practical  side,  however,  they  are  not  of  sufficient  mag¬ 
nitude  to  cause  great  concern,  especially  at  y  —  .95. 

4.  PARAMETRIC  ESTIMATION  OF 
COMPONENT  PARAMETERS 

In  this  section  we  arc  interested  in  examining  how 
the  independence  assumption  affects  the  magnitude  of 
the  estimation  error  in  estimating  component  mean 
life  from  data  collected  on  series  systems.  That  is,  for 
each  system  tested,  we  observe  its  failure  time  and  an 
indicator  variable  that  tells  us  which  component 
caused  the  system  to  fail.  We  are  interested  in  how- 
varying  degrees  of  dependence  affect  the  bias  and 
mean  squared  error  (MSE)  of  the  maximum  likeli¬ 
hood  estimator  of  component  mean  life  obtained  by 
assuming  independent  component  lifetimes. 

We  assume  that  the  two  components'  survival  func¬ 
tions  are  f,(t)  *  exp  ( —  A,  t),  i  ■  1,  2,  and  a  life  test  is 
conducted  by  putting  n  systems  on  test.  We  observe  n, 
systems  failing  because  of  failure  of  the  ith  component, 
i »  I,  2.  Let  T  denote  the  sum  of  all  n  failure  times. 
From  Moeschberger  and  David  (1971).  the  maximum 
likelihood  estimator  of  ,  assuming  independence,  is 

=  n,/T.  t  =  1,2, 

so  the  estimator  of  component  mean  life,  //,  =  ~  \  is 
fit  m  T/nt  if  /i,  >  0.  (4.1) 

Now  suppose  that  wc  are  in  fact  sampling  from  the 
Gumbel  distribution  (2.3).  For  this  model,  component 
mean  life  is  the  same  as  in  the  independent  case.  The 
random  variables  (nf,  T)  are  independent  (the  con¬ 
ditional  distribution  of  T  given  n,  is  free  of  n,|.  and  n.  is 


DEPARTURES  FROM  INDEPENDENCE  IN  SERIES  SYSTEMS 


281 


binomial  with  parameters  »i  and  />,  -  /’(min  l.V,. 
A':)  =  A',).  For  this  model, 

P,  =  f>(X,  <  A',) 

_  ;  f  1  +  4^;.,  -  /■;)/■; _ ) 

(/,  +  /.2  (/.,  +  /.jX2/.,  +  /.jK/.t  +  2/.2)J 

(4.2) 

with  p2  ■  I  -  pt.  From  Mendenhall  and  Lehman 
(I960),  approximations  to  the  moments  of  l/«,,  con¬ 
ditional  on  n,  >  0,  arc 

£(  1  /n(  |  n,  >  0)  »  -j — 7-  (4.3) 

n{a  -  1) 

and 


lation  am!  dominate'  lor  large  n.  approaching  the 
limit  of  2o;i,  a 
When  /  ,  =  . 


MSE  Ip,  I  = 


2ji;ln:  -  2>i  -  3l 
rtln  -  5t>t  —  3) 


2pf(n  -  21 
9 nln  -  5 H»  -  3) 


x  ;(I9ii  -  2II,»  +  2(n  -  3Xn  -  Dp1}.  (4.11) 


As  in  the  bias  expression,  the  MSE  reflects  a  sampling 
error  term  and  a  modeling  error  term.  The  modeling 
error  is  a  quadratic  function  of  p  for  fixed  n.  For 
n  >  5.  this  error  is  increasing  in  p  for 

_  1  (19n  -  21) 

P>  4  (n  -  3Kn  -  I) 


£(l/n,2 In,  >  0) 


(n  -  2Xn  -  3) 
n\a  -  lXa  -  2)’ 


(4.4) 


where  a  -  (it  -  1  )p,.  The  expected  value  of  T  is  given 
by  np0,  where  pD  is  given  by  (2.7),  and 


E(Jl)  =  8p((2;.,  + 


(A,  +  2x2)1)J 


+  «(«  -  1)Po-  (4  5) 


Thus,  the  bias  and  MSE  of  p<,  conditional  on  n,  >  0, 
under  this  model  are 


and  decreasing  in  p  for 

_  1  (19n  —  21) 

P<  4  (»t  -  2Xn  —  1) 

For  sample  sizes  between  5  and  21,  the  modeling 
error,  and  hence  the  MSE.  is  a  strictly  increasing 
function  for  alt  p  e  [—  i.  £].  For  n  >  21,  the  mini¬ 
mum  MSE  is  achieved  at  p  <  0.  As  n  approaches  ae, 
the  value  at  which  the  smallest  MSE  occurs  tends  to  0. 

For  unequal  component  means  a  similar  result 
holds.  Figure  3  shows  the  bias  as  a  function  of  p  for 


fi(A)  *=  £(A  —  Pi) 


(n  -  2)p0 

[("  ~  IXPi  ~  1)] 


(4.6) 


and 


MSE  0i()  -  £(T2)£(  1/n?  |  n,  >  0) 


2p,(n  -  2)p0  , 

[(«-l)A-l]  p" 


(4.7) 


fori  ■  1,2. 

We  note  that  for  large  samples. 


fort 


lim  B(p,)  =  ^-p, 
I-  flO  Pi 


lim  MSE  (p,) 


1,2.  For/,, 
B(P.) 


/.2  from  (4.6),  we  see  that 


1+2 (n  -  2)p/3 
"“in -3)  * 

Pi  2| n  -  2)p;i, 

n  -  3  3 in  -  3)  ’ 


(4.8) 

(4.9) 


(4.10) 


A  similar  expression  holds  for  B(fi ,).  Note  that  (4.10) 
consists  of  two  terms.  The  first  term,  reflecting  sam¬ 
pling  error,  is  positive  for  all  n  and  dominates  the  bias 
expression  for  small  n.  The  second  term,  reflecting 
modeling  error,  takes  on  the  same  sign  as  the  corrc- 


Figure  3.  Bias  of  p,  Under  Cumber s  Model  for 
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Figure  4.  Ratio  of  JMSE  (fit  |  p)/MSE  (p, 1  p -  0) 
for  Various  Sample  Sizes  n  and  for  A,  ■  1 ,  A*  =  1 S. 


various  sample  sizes  when  A,  =  1.0  and  X2  -  1-5. 
Figure  4  depicts  the  ratio 


conditional  survival  probabilities,  namely,  by 


F^x)  =1  if  x  <  xj(1, 


JU.  XI 

=  n 


j*  * 


"  ~  r„ 
n-riJ+ l' 


x  ^  x«u» 


where  j(i,  x)  is  the  largest  value  of  j  for  which  <  x. 

A  special  note  is  needed  to  cover  the  case  in  which 
xHmi)  is  not  the  largest  observed  death.  To  avoid  this 
problem,  we  shall  define  F/(x)  =  0  for  x  greater  than 
the  largest  observed  failure  time. 

If  the  component  lifetimes  in  fact  follow  the 
Gumbel  bivariate  exponential,  we  can  see  that  the 
Kaplan-Meier  estimator  is  not  consistent.  For  i  —  1, 
the  Kaplan-Meier  estimator  is  not  estimating  F^t), 
but,  rather,  another  survival  function,  J?,(t),  given  by 


-exp 


f*  [1  +4p(l  —  e"*2^!  — 2e~i'*)]  ) 

!o  [1  +4p(l “j* 


t  >  0.  (5.1) 


Note  that  if  A,  »  X2  —  (5. 1)  is  simplified  to 

/?,(/)  «  <‘*'[1  +  4p(  1  -  e-*yy\  (52) 

which  is  increasing  in  p.  Similarly,  F2(t)  is  actually 
estimating  H2(t),  which  is  defined  analogously. 

Measures  of  the  error  in  estimating  F,(r)  by  /ri(t)  are 


v/MSE  (p,  |p)/MSE  (pt\p  —  0) 

as  a  function  of  p  for  various  sample  sizes  when  A,  - 
U,»1.5. 

5.  BIAS  OF  THE  PRODUCT 
LIMIT  ESTIMATOR 

A  second  approach  to  the  problem  of  estimating 
component  parameters  is  via  the  nonparametric  esti¬ 
mator  proposed  by  Kaplan  and  Meier  (1958).  Investi¬ 
gators  who  routinely  use  nonparametric  techniques 
may  take  this  approach  in  hopes  of  obtaining  esti¬ 
mators  that  are  robust  with  respect  to  the  assumption 
of  exponentiality.  The  purpose  of  this  section  is  to 
show  that  such  estimators  are  not  necessarily  robust 
with  respect  to  the  assumption  of  independence  when 
the  marginals  are,  in  fact,  exponential. 

The  product  limit  estimator,  assuming  independent 
risks,  is  constructed  as  follows.  As  before,  suppose  n 
systems  are  put  on  test  at  time  0  and  n,  systems  fail 
owing  to  failure  of  component  i.  Let  Jf((U, ..., 
denote  the  ordered  times  at  which  these  n,  events 
occur,  and  let  r,,, . . . ,  rimi  be  the  ranks  of  those  ordered 
survival  times  among  all  n  ordered  lifetimes.  The  com¬ 
ponent  reliability  for  the  ith  component  at  lime  x  may 
now  be  estimated  by  the  product  of  the  individual 


Figure  5.  Bias  of  Kaplan-Meier  Estimate,  Fx(t). 
A,  =  1 ,  Aj  =  15 . 
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the  bias  and  MSE  of  f,(/|  computed  under  the  depen¬ 
dence  model.  Under  this  model,  the  Kaplan-Mcicr 
estimator  is  equivalent  to  the  estimator  one  would 
obtain  based  on  n  observations  from  an  independent 
system  with  component  survival  distributions 
given  by  (5.1)  or,  if  k,  —  /.2,  by  (5.2).  Hence  from 
Kaplan  and  Meier  (1958),  the  variance  of  /',(/)  is  given 

,5J) 

Thus  from  (5.1)  and  (5.2),  the  bias  and  MSE  of  F,i t)  are 


=  '*0,  (5-4) 

and 

MSE  (A<«»  «(#,(«) -W 


+  n>(tf 


\dd,M  | 

nf7,(u)J  ’ 


r  >  0.  (5.5) 


The  estimator  is  not  consistent,  since  B(f^t))  is  inde¬ 
pendent  of  n  and  not  necessarily  zero.  Also,  MSE 
(F,(t))  consists  of  a  factor  that  depends  only  on  the 
model  error  and  is  free  of  sample  size  and  of  a  term 
that  tends  to  0  as  n  tends  to  infinity. 

Note  that  in  the  case  of  equal  component  lifetime 
distributions,  =  k2  *=  </>,  the  bias  determined  from 
(5.2)  and  (5.4)  simplifies  to 


B(A(0)  •  «-*{[!  +  4(1  -  e-*)1]1'2  -  1}.  (5.6) 


© 


Figure  6.  MSE  of  Kaplan-Meier  Estimate,  F,(t). 
kim1.Xi-15.n-W. 


Figure  7.  MSE  of  Kaplan-Meier  Estimate.  Fx{t). 
-1  ,k3-15.n  -50. 


In  the  general  case,  the  integral  in  (5.1)  needs  to  be 
evaluated  numerically.  The  bias  of  the  Kaplan-Meier 
estimator  was  calculated  for  various  values  of  and 
p.  A  representative  plot  of  the  bias  appears  in  Figure 
5,  where  =  1,  k2  —  1.5, and  Ip |  *=  0,  .125,  .250.  It  is 
apparent  that  the  bias  is  largest  for  values  of  t  in  the 
neighborhood  of  an  interval  that  captures  the  mean 
component  lifetimes.  The  absolute  value  of  the  bias 
ranges  from  0  to  .1 1  in  this  example. 

MSE  (fffl)  was  calculated  for  various  values  of  k„ 
n,  and  p.  Its  magnitude  is  typified  in  Figures  6  and  7, 
where  a,  =  1,  k2  =  1.5,  and  n  =  10,  50,  respectively. 
For  /.,  =  1,  k2  -  1.5,  and  n  =  cc,  MSE  (At))  may  be 
obtained  by  squaring  (5.4)  or  by  squaring  the  ordinate 
values  in  Figure  5.  The  MSE  of  the  Kaplan-Meier 
estimator  may  be  quite  large  for  small  sample  size  n 
and  moderately  large  for  “large"  p.  the  former  being  a 
more  crucial  factor  than  the  latter. 

6.  SUMMARY 

The  results  presented  here  show  that  for  the 
Gumbel  model,  one  may  be  misled  by  falsely  as¬ 
suming  independence  of  component  lifetimes  in  a 
series  system.  In  modeling  system  reliability  based  on 
complete  information  about  two  marginal  component 
life  distributions,  effects  of  erroneously  assuming  inde¬ 
pendence  of  component  lifetimes  is  most  pronounced 


TECHNOMETRICS  <*.  VOE  26  NO  3.  AUGUST  1984 


284 


M.  L.  MOESCHBERGER  AND  JOHN  P.  KLEIN 


for  system  reliabilities  smaller  than  .75.  For  system 
reliabilities  larger  than  .90,  this  effect  is  too  small  to  be 
of  practical  interest.  The  elTecis  of  a  departure  from 
independence  on  the  Mann-Grubbs  bounds  for  small 
sample  sizes  seems  to  be  negligible  for  confidence 
levels  greater  than  .90.  But  for  either  large  sample 
sizes  or  smaller  confidence  levels,  one  may  be  appre¬ 
ciably  misled. 

For  the  dual  problem  of  estimating  component  re¬ 
liability  based  on  data  from  a  series  system,  it  appears 
that  departures  from  independence  are  of  a  greater 
consequence.  Both  parametric  and  nonparametric  es¬ 
timators  of  relevant  component  parameters  are  incon¬ 
sistent.  Although  under  independence,  the  bias  of  the 
estimators  of  interest  clouds  the  issues,  it  is  clear  that 
for  larger  negative  correlations  these  estimators  tend 
to  underestimate  the  parameter,  whereas  for  large 
positive  correlations,  the  reverse  is  true. 
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Abstract 


The  problem  of  testing  for  independence  of  the  component  lifetimes 
when  the  components  are  linked  in  series  is  considered.  To  avoid  the 
problem  of  nonidentif lability  the  marginal  component  lifetimes  are  assumed 
to  be  known.  In  this  setting  a  modified  version  of  Kendall's  Tau  is 
proposed.  This  test  statistic  is  obtained  by  replacing  those  component 
lifetimes  which  cannot  be  observed,  due  to  system  failure,  by  conditional 
probabilities  computed  under  independence.  A  small  scale  simulation  study 
of  the  power  of  this  test  shows  the  test  has  reasonable  power  for  relatively 
small  sample  sizes. 


Key  Words :  Series  Systems;  Test  for  Independence;  Kendall's  Tau; 
Exponential  Distribution. 


1.  DEPENDENT  SYSTEMS 


A  ccrmon  assumption  made  in  modeling  series  systems  is  that  the 
component  lifetimes  are  statistically  independent.  This  assumption  is 
also  routinely  made  in  analyzing  data  collected  from  series  systems. 
Recently,  Klein  and  Moeschberger  (1983)  and  Moeschberger  and  Klein  (1984) 
have  shown  that  one  may  be  appreciably  misled  by  this  independence 
assumption  for  certain  bivariate  exponential  systems. 

To  illustrate  the  effects  of  this  independence  assumption  consider 
the  following  two  models  for  the  joint  survival  function  of  the  component 
lifetimes  (X,Y).  The  first  model,  due  to  Oakes  (1982)  has  joint  survival 
function 


H(x,y)  =  ?(X>x,  Y>y) 


-1 

re-i) 


6  >  1 


(1.1) 


where  F( • ) ,  G( • )  are  the  marginal  survival  functions  of  X  and  Y  respectively. 
This  distribution  has  a  coefficient  of  concordance  x  -  (9-1)/ (0+1)  and 
9  =  1  corresponds  to  independent  component  failure  times.  If  A(x| Y=y)  and 
A(x| Y>y)  denote  the  conditional  hazard  functiois  for  the  conditional 
distributions  of  X  given  Y  =  y  and  given  Y  >  y,  respectively  then 
A(x|Y=y)  =  6  A(x|Y>y). 

A  second  model,  due  to  Gumbel  (1960),  has  joint  survival  function 
H(x,y)  =  F(x)G(y)[l+a(l-F(x))(l-G(y))],  -  1  <  a  <  1.  (1.2) 

This  model  has  coefficient  of  concordance  x  =  2  a  /9  which,  unlike  the 
Oakes  model,  may  be  both  positive  and  negative. 

To  illustrate  the  importance  of  the  independence  assumption  in 
modeling  the  system  life  consider  figures  1  and  2  where  the  and  99t^ 


percentile  of  system  life  is  plotted  for  the  mwc  models  with  exponential 
marginals.  Here  in  all  cases  the  first  component  has  unit  mean  life.  For 
the  Gumbel  model  the  true  percentile  ranges  from  80%  to  115%  of  the 
percentile  computed  under  independence,  while  in  the  Oakes  model  the  true 
percentile  can  be  as  much  as  twice  as  big  as  one  percentile  computed  under 
independence  when  and  as  much  as  1.5  -ime  as  big  when  X^  =  2. 

Since  one  may  be  appreciably  misled  by  erroneously  assuming  independent 
component  lifetimes  it  is  desirable  to  test  this  hypothesis  based  on  data 
from  series  systems.  However,  if  no  assumptions  about  the  underlying 
distribution  of  the  component  lifetimes  is  made  such  a  test  is  impossible 
due  to  the  identif lability  problem  (see,  e.g.  Tsiatis  (1578),  Miller  (1977), 
Basu  and  Klein  (1982)).  This  is,  given  any  set  of  observable  information 
(such  as  system  life,  crude  system  life,  etc. 5  collected  from  a  series 
system  with  dependent  component  lifetimes,  there  exists  a  series  system 
with  independent  component  lifetimes  with  the  same  observable  information 
(see  Langberg,  Proschan  and  Quinzi  (1981)).  However,  this  comparable 
system  of  independent  random  variables  need  r.ot  have  the  same  marginal 
component  life  distributions  as  the  dependent  structure.  In  particular,  the 
marginal  distributions  of  the  two  systems  are  the  same  only  for  the  class  of 
constant  sum  models  defined  by  Williams  and  Lagakos  (1977). 

In  the  next  section  a  modification  of  Kendall's  (1533)  test  for 
independence  is  proposed.  This  test  assumes  chat  the  marginal  component 
life  distributions  are  completely  specified.  This  information  could  be 
obtained  by  testing  each  component  separately ,  as  is  of men  done  in  the 
development  stages  of  system  design  (see,  e.g.  Easterling  and  Prairie 
(1971),  Mastran  (1976),  or  Miyaroura  (1982)).  In  section  3  a  simulation 


study  compares  the  power  of  this  test  to  the  parametric  tests  based  on 
Oakes  and  Gumbel  models. 


2.  THE  TEST  PROCEDURE 


Suppose  that  n  two  component  series  systems  are  put  on  test.  I^t 
Xf,  Y^  denote  the  potential  (unobservable)  failure  times  of  the  first  and 
second  components  of  the  i^1  system.  We  are  not  allowed  to  observe 
(Xu  YO  directly,  but  instead  we  observe  T^  =  nin(X^,  Y^> ,  the  system 


failure  time  and 


|l  if  ,  the  cause  of  the  system  failure. 

S  0  if  \  =  Yi 


Also  suppose  that  the  marginal  survival  functions  of  X^  and  Y^,  F(x)  =  P(X^>x) 
and  £(y)  =  P(Y^>y),  i  =  l,...,n  are  known. 

If  we  could  observe  both  Xi  and  Yi  then  a  test  of  independence,  due 
to  Kendall  (1933).  is  to  count  the  number  of  concordant  pairs  and  the  number 
of  discordant  pairs.  A  pair  (Xp  YO,  (Xj,  YO  is  concordant  if  X.^  -  Xj 
and  -  Y-  have  the  same  sign  and  is  discordant  if  these  differences  have 
different  signs.  The  test  statistic  is  then  the  number  of  concordant 
pairs  minus  the  number  of  discordant  pairs. 

If  the  data  canes  from  a  series  system  then  only  T^,  1^  is  observed. 
Suppose  we  consider  a  pair  Clh,  1^),  (Tj,  Ij)  with  T^  <  Tj .  If  1^  =  1  and 
T.  =  1  then  we  know  that  ^  <  Xj  =  Tj ,  and  Xi  <  Yj_,  Xj  <  Y y  This 
pair  would  be  concordant,  regardless  of  the  value  of  Yj ,  if  T^  <  Y^  <  Tj. 

If  Y^  >  Tj  concordance  or  discordance  depends  cn  the  value  of  Yj.  Under  the 
null  hypothesis  of  independence,  the  conditional  probability  that  the 
pair  is  concordant  is  [G(T^)  -  G(Tj) ]/G(T^)  =  <  Y  <  Tj  |  Y  >  T^)  since 

average  concordance  over  the  range  Y  >  Tj  is  C.  When  1^=1  and  I.  =  0 


then  T 


i  =  Xi  <  Yj  =  Tj’  xi  <  Yi’  Here  Ti  ^  Yi  <  Tj  the  P3^ 


would  be  concordant  and  if  Y^  >  the  pair  would  be  discordant,  whatever 


the  value  of  X^.  Under  independence  the  conditional  probabilities  of  these 


two  events  are  CGCI1)  -  GCl\  )]/G(T^)  and  G(T.l  3(T^) ,  respectively.  Should 
1^=0  similar  probabilities,  involving  F,  could  be  obtained.  This 
motivation  suggests  the  following  score  function  for  Td  <  Tj 


♦CT^I^T-,^)  =  < 


f  [G(Ti)-G(T;j)]/G(Ti) 

if 

li  =  *3  =  1 

CF<t.)-F(t.)]/F(t*) 

/ 

if 

!.  =  I.  =  0 

j  C5(Ti>-26’(Tj )  ]/G(T.. ) 

if 

^  =  1,  Ij  =  0 

^  [F(Ti)-2F(T;. )]/?(!,) 

if 

ii  =  0,  I.  =  1 

and  similarly  for  >  Tj. 

The  modified  version  of  Kendall 1 s  test  statistic  is 


t=  l  l  )'(?). 

l<i<3<n  1  1  J  -  " 


(2.1 


(2.2 


To  find  the  moments  of  T,  under  independence,  consider  the  pairs 


(Tx,  L^,  (T2,  I2).  Let  Ax  =  {Tx  <  Tj,  I±  =  I2  =  1}, 

Aj  *  {Tx  <  T2,  Ix  =  1,  I2  =  0},  A3  =  nx  <  T2,  Ix  =  0,  I2  =  0}  and 

A4  =  {^  <  T2,  =  0,  I2  =  1}.  In  terns  of  the  unobservable  component 

lifetimes,  (Xi,  Y^),  A1  =  {X1  <  Xj,  X1  <  Y^,  Xj  <  Y2), 

Aj  =  {Xj^  <  Y2,  ^  <  Yx,  Y2  <  X2J,  A3  =  (Yx  <  Y2,  Yx  <  XL,  Y2  <  X^,  and 

\  =  {Y^  <  X2,  Y^  <  X^,  X2  <  Y2).  Since,  by  r. —retry  T^  is  equally  likely 

to  be  either  smaller  or  larger  than  T„  we  have 


f  G(x, )-G(x~) 

=  - = - - - dF(:<1)cF(x2)dG(y1)dGCy2) 

2  1  1  L  L  G(x. ) 


GCx1)-2G(y2) 


\2  G(x^) 


dF(x1)dF(x2)dG(y1)dG(y2) 


+  f  F(yl)"F(Vl)-  dF(x1)dF(x,)dG(y1)dG(y2) 

JAj  T(yi)  1  (.2.2) 

(  F(y1)-2F(x,)  , 

+  - 1 - —  dF(x,)dF(x2)dG(y1)dG(y2). 

\  nVl) 

=  Jl  +  J2  +  J3  +  J4  (say). 

Now,  consider 

J  +  J  =  r  {f  CG(x)-G(y)]G(y)dF(y)  +  r[G(x)-2G(y)]F(y)dG(y)}dF(x) .  (2.4 

1  2  J  _<»  J  x  x 

Integrating  the  first  inner  integral  in  (2.4)  by  parts  yields  the  negative 
of  the  second  inter  integral  so  that  +  J2  =  0.  Similar  computations  sno.-. 
that  J3  +  J4  =  0-  Thus  E(*(T1$  Iv  T2,  I2))  and  hence  E(t)  are  both  0. 

By  similar  computat  ions  one  can  show  that 

n(n-l)V(t)  =  |  |  G(x)2dF(x)  +  |  j  F(x)2dG(x) 

-  4  f  G(x)"1|F(y)G(y)2dG(y)dF(x) 

J-00  JX 

C°°  r°°  _ 

-  4  F(x)_1  G(y)F(y)ZdF(y)dG(x) 

J_W  ->X 

+  4(n-2)  {  ^  -  2 J  F(x)G(x)d(F(x)  +  G(x))  (2. 

-  2f°  F(x)2G(x)2d(F(x)  +  G(x)) 

'  .OO 

+  3J00  F(x)2G(x)dF(x)  +  3j"  F(x)G(x)2dG(x)} ,  where  F(x)  =  1  -  *( 

And  G(x)  =  1  —  (i(x) 


The  asymptotic  normality  of  t  follows  by  the  results  of  Hoeffdir.g  (1948). 
Hence,  a  test  of  independence  versus  dependence  rejects  if  jx//'(T) j  is 
greater  than  the  appropriate  percentage  point  of  a  standard  normal  random 
variable.  A  test  of  independence  versus  positive  dependence  rejects  if 
is  too  large. 

A 

The  variance  of  t  (2.5)  can  be  expressed  explicitly  in  several  cases. 


Case  1.  F(x)  =  G(x).  In  this  case  (2.5)  reduces  to 


V(t)  = 


3Cn(n-l)  * 


(2.5) 


Case  2.  (Lehmann  structure)  F(x)  =  G(x)a.  Here  (2.S)  reduces  to 


-i'vw/Os  _  8a[35a  +  n(Sa2+2a+9)] 
n(n-l)V(T)  -  3(3a-1y(3+cl)(2a+3)(33 


3(3a+l)(3+a)(2a+3)(3a+2)  * 

Case  3.  (X,  Y  exponential),  F(X)  =  e~*x,  G(y)  =  then  (2.5) 


(2.7) 


reduces  to 


/  Tivi'rl  -  8l6[3S.  6  +  n(9/.  +2a6+9v~~ )  j 
mn  xjvkij  -  ,, •n+ft So xXR'i U? i+?co 


(2.8) 


When  the  true  values  of  F,  G  are  ris specified  then  E(t)  is  not  zero. 

—  -s—  O 

If  the  true  component  lifetime  distributions  are  F,  G”  but  r  ,  are  used  in 
formula  (2.3)  then  one  can  show  that,  under  independence. 


=  2 <  1— B ) f  G(x)1"^  f  F(y)S'(y)SdG(y)d?(x) 

+  2(l-a) f  f(x)1m:i  f  S’(y)F(y)CtdF(y)d3(x),  a,  S  >  0. 

*  -oa  *  v» 


(2.9) 


If  F(-)  =  G(-)  then  ECO  =  8-1  +  u-1 

HH+TJ  TUsT) 

If  F(x)  =  G(x)®  then  E(t)  -  8  j  a-1  +  S-l  j 

(e+i)  |0+uf+i  5+f+rf 

Similar  expressions  can  be  obtained  for  the  null  variance  or  t. 


a 


3.  SIMULATION  STUDY 


To  study  the  effectiveness  of  the  -.edified  Kendall's  t  described  in 
section  2  a  simulation  study  was  conducted.  The  study  was  performed  by 
genera tins  1000  samples  of  n  =  20  or  4G  series  systems  with  exponentially 
distributed  component  life  times,  FCx)  =  e-x  and  G(y)  =  e  ^2^5  x2  =  1,  2. 

Both  the  Oakes  joint  distribution  G.l)  and  the  bivariate  Gumbel  distribution 
(1.2)  were  used.  The  bivariate  observations  from  the  Oakes  distribution 
were  generated  using  the  technique  described  in  section  2  of  that  paper. 

To  generate  Gumbel  random  variables  with  marginal  survival  functions 
F(x),  G(y)  let  U]_,  U2  be  independent  uniform  randan  deviates.  Note  that 

F(x|y)  =  P(X>x| Y=y)  =  F(x)(l+a-2aT(y))  -  aF(x)2(l-2G(y)).  (3.1) 

Let  =  §"(y)  and  U2  -  F(x|y)  =  F(x)[l-Ki~2cU1  ]  -  aF(x)2(l-2U1). 

Solving  this  equation  for  F(x)  yields 

9  o  1/9 

F(x)  =  U*  =  (l+n(l-2U1))  -a(l+a  (1-iU- )‘  +  2a(l-2U1)(l-2U2))  (3.2) 

~  2e(l-2U1)  ,  Ux  i  1/2 

which  is  the  root  which  lies  in  the  interval  [0, 1].  If  U^  =  1/2  then  U*  =  U2- 

The  pair  (X,  Y)  is  then  found  by  X  =  F~“('J*),  Y  =  G  ^(U^). 

For  the  purpose  of  comparison  the  parametric  tests  for  independence, 

based  on  the  efficient  scores  statistics,  for  the  Gumbel  and  Oakes  model 

were  obtained.  Consider  first  the  Gumbel  model  (1.2).  Using  the  notation 

in  section  2,  the  observable  crude  density  for  I  =  1  is 

-d  P(T>t,I=l)  =  qx(t)  =  f (t )G(t ) [ 1+i ( 1-F (t ) -2(1  (t )  +  F(t)G(t))] 

whereof (t)  =  -d  F(t),  and  a  similar  expression  for  qQ(t)  when  1=0.  Based 
dt 

on  a  sample  of  n  series  systems  the  like  1  mood  function  is 

n  I.  l-l z 

L(cO  «  n  q-^t.)  -  sQ(t.)  J 


(3.3) 


z  ^.^wy +  (i-ij)x?Ti +  ZjWj  -  wv  )•  xi :  x2 


(2nX XX2) 


1/2 


which  is  approximately  standard  normal  for  large  n  when  ft  =  1. 

The  results  of  this  study  are  reported  in  table  1.  From  this  table 
it  seems  like  the  modified  t  test  has  reasonably  good  power  when  compared 
to  the  parametric  tests,  although  comparison  with  the  Oakes  score  test  is 
hard  since  the  significance  level  of  that  test  is  inflated.  Also  the  test 
based  on  the  Gumbel  scores  has  comparable  power  when  the  data  is  from  the 
Oakes  model.  A  test  for  normality  done  on  the  samples  where  the  components 
were  independent  accepted  the  normality  assumption  for  the  modified  x  test. 

Table  2  reports  the  observed  number  of  rejections  when  the  component 
parameters  are  estimated  based  on  independent  samples  of  size  50  for  each 
component.  A  .05  significance  level  was  used.  Here,  when  =  A2,  all  tests 
have  inflated  levels.  When  *  X2  the  tests  are  conservative.  All  tests 
nave  comparable  power  wnen  X-^  =  X2,  however  the  modified  x  test  has  signi¬ 
ficantly  higher  power  when  X ^  i  X^ . 

A 

In  addition  to  the  power  of  our  modified  test  the  E(t)  was  estimated 
for  each  sample.  Except  in  the  independence  case  the  simulation  showed 
that  E(t)  =  .35t,  suggesting  x  is  of  limited  use  as  a  point  estimator  of  x. 
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I. 


INTRODUCTION 


A  common  problem  in  survival  analysis  is  to  estimate  the  marginal 
survival  function  of  the  time,  x»  until  some  event  such  as  remission, 
component  failure,  or  death  due  to  a  specific  cause  occurs.  Often  obser¬ 
vation  of  this  main  event  of  interest  is  impossible  due  to  the  occurrence 
of  a  competing  risk  at  some  time  Y  <  X,  such  as  censoring,  failure  of  a 
different  component  in  a  series  system,  or  death  from  some  cause  not 
related  to  the  study.  Standard  statistical  methods,  which  assume  these 
competing  risks  are  independent,  estimate  the  marginal  survival  function 
by  the  Product  Limit  Estimator  of  Kaplan  and  Meier  (1958).  This  estimator 
has  been  shown  to  be  consistent  for  the  marginal  survival  function  by 
Langberg,  Proschan  and  Quinzi  (1981)  when  the  risks  follow  a  constant 
sum  model  defined  by  Williams  and  Lagakos  (1977)  .  When  the  risks  are 
not  in  the  class  of  constant  sum  models,  the  Product  Limit  Estimator 
is  inconsistent  and,  in  such  cases,  the  investigator  may  be  appreciably 
misled  by  assuming  independence. 
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In  the  competing  risks  framework  we  observe  T  =  minimum(X,Y)  and 
I  *  X(X  ^  Y)  where  X(')  denotes  the  indicator  function.  Tsiatis  (1975) 
and  others  have  shown  that  the  pair  (T,I)  provides  insufficient  information 
to  determine  the  joint  distribution  of  X  and  Y.  That  is,  there  exists  both 
an  independent  and  a  dependent  model  for  (X,Y)  which  produces  the  same 
joint  distribution  for  (T,I).  However,  these  "equivalent"  independent 
and  dependent  joint  distributions  may  have  quite  different  marginal 
distributions.  Also,  due  to  this  identif iability  problem,  there  may  be 
several  dependent  models  with  different  marginal  structures  which  will 
yield  the  same  observable  information,  (T,I).  In  light  of  the  consequences 
of  the  untestable  independence  assumption  in  using  the  Product  Limit 
estimator  to  estimate  the  marginal  survival  function  of  X,  it  is  important 
to  have  bounds  on  this  function  based  on  the  observable  random  variables 
(T,I)  and  some  assumptions  on  the  joint  behavior  of  X  and  Y. 

Peterson  (1976)  has  obtained  general  bounds  on  the  marginal  survival 
function  of  X,  S(x),  based  on  the  estimable  joint  distribution  of  (T,I). 

Let  Qj^(x)  -  P(T  >  x,  1  =  1),  and  Q2(x)  =  P(T  >  x,  I  =  0  )  be  the  crude 
survival  functions  of  T.  His  bound,  obtained  from  the  limits  on  the  joint 
distribution  of  (X,  Y)  obtained  by  Frechet  (1951),  is 


Qj,  (x)  +  Q2(x)  <  S (x)  <.Qx(x)  +  Qx(0) 


(1.1) 


Since  these  bounds  allow  for  any  dependence  structure,  they  can  be  very  wide 
and  provide  little  useful  information  to  an  investigator. 

Fisher  and  Kanarek  (1974)  have  obtained  tighter  bounds  on  S(x)  in 
terms  of  a  dependence  measure  a.  Their  model  assumes  that  simultaneous 


'  V  v'  <•  •  *  •  • 


!.! 
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Co  che  occurrence  of  Y  an  evenc  occurs  which  eicher  sCretches  or  concraccs 

Che  remaining  life  of  X  by  an  amounc  associaCed  wich  a.  Thac  is, 

P(X  >  x|Y  =  y  <  x)  =  P(X  >  y  +  oc(x-y)|Y  >  y  +  a(x-y)).  A  large  a,  for 

example,  implies  chac  a  small  survival  after  censoring  is  che  same  as  a-cimes 

as  much  survival  if  censoring  was  noC  present.  They  show  that  if  a  is  assumed 

known,  then  the  marginal  survival  function  can  be  estimated  from  the 

observable  information.  Also  these  estimates,  Sa(x),  are  decreasing  in  a. 

For  their  bounds,  the  investigator  specifies  a  range  of  possible  values 

aL  <  a  <  au  so  that  (x)  <_  S(x)  <_  S  (x) . 

u  L 

Recently,  Slud  and  Rubenstein  (1983),  have  proposed  general  bounds. 

They  show  that  knowledge  of  the  function 


P  (x) 


lim 

S-0 


P(x  <  X  <  x+<5  X  >  x,  Y  <_  x) 
P(x"<  X  <  x  +6|X  >  x ,  Y  ’>  x) 


along  with  the  observable  information  (T,I)  is  sufficient  to  uniquely 
determine  Che  marginal  distribution  of  X.  These  estimates  Sp(t)  are 

decreasing  functions  of  p  for  fixed  x.  Their  bounds  are  obtained  by 
specifying  a  range  of  possible  values  p^(x)  p(x)  <_  P2(x)  so  that  if 
p(x)  is  the  true  function  §p2(x)  <_  S(x)  <_  Sp^(x). 

In  this  paper  we  obtain  different  bounds  on  the  marginal  survival 
function  by  assuming  a  particular  dependence  structure  on  X  and  Y.  These 
bounds  are  functions  of  the  observables  (T,I)  and  a  familiar  dependence 
measure,  Che  concordance  probability  between  X  and  Y.  In  Section  2  we 
describe  this  model  in  detail.  In  Section  3  we  derive  the  bounds  and  show 
consistency  when  the  dependence  parameter  is  known.  In  section  4  these 
bounds  are  compared  to  those  obtained  by  Peterson,  Fisher  and  Kanarek, 


and  Slud  and  Rubenstein. 


II.  THE  MODEL 


The  dependence  structure  we  shall  employ  to  model  the  joint  survival 
was  first  introduced  by  Clayton  (1978)  to  model  association  in  bivariate 
lifetables  and,  later,  by  Oakes  (1982)  to  model  bivariate  survival  data. 
Let  S(x)  -  P(X  >  x),  R(y)  =  P(Y  >  y),with  S(0)  =  R(0)  =  1,  be  the 
continuous  univariate  survival  functions  of  the  death  and  censoring  times, 
respectively.  For  9  1  define  F(x,y)  =  P(X  >  x,  Y  >  y)  by 


F(x,y) 


[{ 


1  }0-l 


S(x) 


.  ,  1,9-1  ,-1/ (0-1) 

lR(y) *  J 


(2.1) 


This  joint  distribution  has  marginals  S  and  R.  As  9-*T ,  then  (2.1)  reduces 
to  the  joint  distribution  with  independent  marginals.  For  0-«*>,  F(x,y)  -*• 
min(S(x) ,  R(y))  the  bivariate  distribution  with  maximal  positive  association 
for  these  marginals.  The  probability  of  concordance  is  0/(0  +  1)  so  that 
Kendall's  (1962)  coefficient  of  concordance  is  x  =  (0  -  l)/(0  +  1)  which 
spans  the  range  0  to  1. 

This  model  has  a  nice  physical  interpretation  in  terms  of  the 
functions  A(x|y  »  y)  and  A(x|Y  >  y) ,  the  hazard  functions  of  X  given  Y  =  y 
and  X  given  Y  >  y,  respectively.  From  (2.1)  one  can  show  that 

A (x | Y  -  y)  =  0A(x|Y  >  y) 
or 

P(X  >  x | Y  =  y)  =  [P (X  >  x|Y  >  y )  ]  ?  (2.2) 

For  0  > 1  the  hazard  rate  of  survival  if  censoring  occurs  at  time  y  is 
0  times  the  hazard  rate  of  survival  if  censoring  does  not  occur  at 


time  y.  This  implies  that  the  hazard  rate  after  censoring  occurs  is 


accelerated  by  a  factor  of  0  over  the  hazard  rate  if  censoring  had  not 
occurred.  Also  when  9*1,  (2.2),  reduces  to  the  condition  required  by 
Williams  and  Lagakos  (1977)  for  a  model  to  be  constant  sum  and  hence  for 
the  usual  product  limit  estimator  of  S(t)  to  be  consistent  (See  Basu  and 
Klein  (1982)  for  details). 


Oakes  (1982)  also  shows  that  (2.1)  can  be  obtained  from  the  following 

1  ft  —  1 

random  effects  model.  Let  S*(x)  =  exp  {-  ]  +1}  and  leC  R*(y)  be 


similarly  defined.  Let  W  have  a  gamma  distribution  with  density 

In  -i  _w 

g(w)aw°  e  and  conditional  on  W  *  w  let  X,Y  be  independent  with 


survival  functions  {S*(x) }w  and  {R*(y)}w.  Then,  unconditionally,  X,Y  have 
the  joint  survival  function  F(x,y)  given  by  (2.1). 

For  fixed  marginals  S  and  R  Che  joint  probability  density  function, 
f(x,y),  can  be  shown  to  be  totally  positive  of  order  2  for  all  0  1. 

This  implies  that  (X,Y)  are  positive  quadrant  dependent.  In  particular, 
one  can  show  that  for  S,R  fixed  the  family  of  distributions 
F  *  (F(x,y):  0  >_  1 }  is  increasing  positive  quadrant  dependent  in  0  as 


defined  by  Ahmed,  et  al .  (1979). 


III.  BOUNDS  ON  MARGINAL  SURVIVAL 

Suppose  that  X  and  Y  have  the  joint  distribution  (2.1)  and  let 
T  =  min  (X,Y),  then  the  survival  function  of  T  is 


F(T)  =  ([ 


1  ,0-1 


1  ,8-1 


S(t) 


[R(t)] 


-  1] 


1 

0-1 


(3.1) 


and  the  crude  density  function  associated  with  X, 


qj(t)  =  P(T  <  t,  X  <  Y),  is  given  by 

q,(t)  =  Zip-  (F(t)]9,  (3.2) 

S9(t) 

where  s(t)  =  -dS(t)/dt. 

Now  consider  the  differential  equation 


s(t)/S9(t)  =  qi(t)/IF(c)f 


(3.3) 


and  suppose  0  is  known.  Then  the  solution  of  (3.3)  for  S(t)  is 


S9(c)  . 


'l(“>  ]. 


1  +  (9-1)  l  ,1<U)  • 

[F 


if  0  >  1 


(3.4) 


=  exp(  -  / 


t  qx(u) 


F(u) 


du) 


if  8  =  1. 


The  functions  F(>)  and  q^(’)  are  directly  estimable  from  the  data  one 
sees  in  a  competing  risks  experiment.  Let  T^ ,  ...,  Tr  denote  the  observed 
test  times  of  n  individuals  put  on  test  and  let  1^,  i  =  1,  ....  n  be  1  or  0 


according  to  whether  the  T.  was  an  observation  on  X.  or  Y.,  respectively. 


7 


Define  F(t)  =  Ix(T.  >  c)/n  and  Q  (t)  =  t  x  (T .  t ,  I .  =  1) . 

i*l  i  =  l  1  3 


Then  if  0  is  known,  a  natural  estimator  of  S.(t)  is 

o 


t  dQ1(u) 


V‘>  -  [1,  (6-1)  ,  (?Mfl 


exp 


t  dQ  (u) 

(  _  j  —  i -  ) 


]"  (0-1)  if  0  >  1 


if  0  =  1 


(3.5) 


0  F(u) 


For  0=1,  this  estimator  is  of  the  form  of  the  hazard  rate  estimator 
proposed  by  Nelson  (1972).  The  estimators  (3.5)  can  be  expressed  in 
the  following  form  for  computation  purposes. 


sQ(t)  = 


exp 


I  (n-i+1) 
T(i)-ti’I(i)  =  1 


if  e  =  i 


(3.6) 


[1  +  (9-l)n 


0-1 


l  (n-i+1)  0]  "  9-1 


T(i)<C’I(i)=1 


if  0  >  1 


where  T,,.,  ....  T,  .  are  the  ordered  death  times. 

(i J  in; 

For  0  known  and  if  the  true  underlying  joint  distribution  of  (X,Y) 


is  of  the  form  (2.1)  then  S0(t)  is  a  consistent  estimator  of  S(t)  as  shown 
by  the  following  theorem. 


Theorem  1 .  Let  (X,Y)  have  the  form  (2.1)  with  marginals  S(t),  R(t) 
respectively.  Let  0  >  1  be  known.  Then  on  the  set  where  S(t)  >  0  we  have 
S0(t)  -  S(t)  a.s. 


<  tu 


For  8=1,  Che  result  follows  by  a  theorem  of  Langberg,  Proschan  and 

Quinzi  (1981).  Suppose  that  0  >  1.  Note  that  Q^(t)  -*•  Q^(t)  a.s.  and 

(u)  -*■  F(u)  a.s.  by  the  strong  law  of  large  numbers.  Since  Sg(t)  is  a 

C  dQ  (u) 

continuous  function  of  ^  „  in  the  support  of  F(u),  it  suffices  to  show 

0  (F(u)]6 

C  dQ^u)  C  dQ^u) 

I  -  -*■  1  - o-  a.s. 

0  (F(u)f  0  [F  (u)]° 

Now,  after  an  integration  by  parts. 


^  dQ^u) 

“  § 
0  [F(u)]° 


Ql(0  z 

[F(t)]9  0 


Q  iOOd^-e 

Fe(u) 


) 


Q^U) 

[F(t)]9 


c  ~  1  c  1 

l  tQ,(u)  -  Q  (u)]d(xQ  )  +  /  (^(ujd^  ) 

0  F  (u)  o  F  (u) 


§1 (c)  -  Qx(t) 
[F  (u)]0 


1  [Q,(u)  -  Q. (u) Jd(4fi  ) 
0  1  FU(u) 


C  dQ  (u) 
+  ' 

0  F  (u)° 


(3.7) 


By  the  dominated  convergence  theorem 


9 


Urn  Vc>  - 

n~  If  (u)]9 


0.  a .  s .  , 


and 

A 

lim  sup  { |Q  (u) 
n-*00 


Q^(u) | }  =  0,  a.s. 


Hence,  applying  Che  above  results  Co  (3.7),  Che  result  now  follows:  // 

To  obtain  bounds  on  Che  net  survival  function  based  on  data  from  a 

competing  risks  experiment,  we  proceed  as  follows.  First,  note  that  from 

(3.5)  it  is  true  that  §a(t)  is  a  decreasing  function  of  9  for  fixed  t. 

u 

+  /.  C  -'-l  /V 

Also,  as  9  -*■  1  we  have  Sg(t)  f  exp  (-  j  F  (u)dQ^(u)). 

0 

which  provides  an  upper  bound.  Notice  that  this  upper  bound  corresponds 

A  A 

to  an  assumption  of  independence.  As  9  -*•  00  one  can  show  that  SQ(t)  1  F(t) 

o 

which  corresponds  Co  Peterson's  (1976)  lower  bound. 

In  practice  the  above  bounds,  with  9  =  1 ,  °°,  while  shorter  than 
Peterson's  bounds,  may  still  be  quite  wide. 

Tighter  bounds  may  be  obtained  by  an  investigator  specifying 
a  range  of  possible  values  for  9.  If  the  sample  size  is  sufficiently 
large  and  9^  £  9  <_  62,  then  Sg  (t)  <  S(t)  _<  Sg  (t).  Specifying  9^,  9^ 

is  equivalent  to  specifying  a  range  of  values  <  t  <  c^e 

coefficient  of  concordance  T  since  9  =  (1  +  t )  /  (1  — T) .  Hence  the  primary 
value  of  Sq ( c )  is  in  putting  bounds  on  S(t)  rather  than  on  estimation  of 


S(t)  . 


IV.  EXAMPLE  AND  COMPARISONS 


To  illustrate  the  bounds  obtained  in  the  previous  section,  consider 
the  mortality  data  reported  in  Hoel  (1972) .  The  data  was  collected  on  a 
group  of  RFM  strain  male  mice  who  were  subjected  to  a  dose  of  300  rads  of 
radiation  at  age  5-6  weeks.  There  were  three  competing  risks,  thymic 
lymphoma,  reticulutum  cell  sarcoma,  and  other  causes  of  death.  For 
illustrative  purposes  we  consider  reticulum  cell  sarcoma  as  the  risk  of 
interest . 

Table  1  reports  the  value  of  Sa(t)  for  concordance  x  *  (0  -  1 ) / (0  +  1) 

A 

The  value  of  Sg(t)  at  T  =  0  corresponds  to  Nelson's  (1972)  hazard  rate 
estimator  assuming  independence.  Peterson's  upper  and  lower  bound 
(x  *  1)  are  also  reported  as  are  Fisher  and  Kanerek's  bounds  and  the  Slud 
and  Rubenstein  bounds  for  several  values  which  reflect  a  positive 
association  between  risks. 

From  Table  1  we  first'  note  that  Peterson's  bounds  are  very  wide. 
Substantial  improvement  is  obtained  if  one  assumes  a  non-negative 
dependence  structure  between  risks  (See  Table  2).  Further  tightening 
of  these  bounds  is  achieved  by  assuming  that  x  is  in  the  range  0  to  .5 
where  the  width  of  the  boundaries  is  at  most  about  50%  of  that  of  Peterson' 
bounds. 

Substantial  improvement  in  the  general  bounds  is  also  obtained  by 
the  bounds  of  Fisher  and  Kanerek  or  Slud  and  Rubenstein.  The  bounds  of 
Fisher  and  Kanerek  assume  a  specifi:  censoring  pattern  and  require  a 
specification  of  a  stretching  constant  a.  Without  some  additional  informa¬ 
tion,  such  specification  may  be  impossible.  Slud  and  Rubenstein's  bounds 
are  for  the  general  dependence  structure.  Their  bounds  require  the 
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specification  of  Che  p(t)  function.  This  function  is  a  quantity  which 
is  not  easily  conceptualized  by  investigators  from  either  a  statistical 
or  biological  perspective.  This  makes  it  questionable  whether  reasoable 
upper  and  lower  bounds  on  p(t)  can  be  extracted  from  one's  prior  beliefs. 
The  major  advantage  of  the  bounds  printed  in  this  paper  is  that  they 
require  only  the  specification  of  an  upper  and  lower  concordance,  a 
measure  quite  familiar  to  most  investigators  and  easily  explainable  to 
nonstatisticians. 


Table  2 

RELATIVE  SIZE  OF  THE  BOUNDS  ON  NET  SURVIVAL 
FOR  AN  ASSUMED  DEPENDENCE  STRUCTURE 
AS  COMPARED  TO  PETERSON'S  BOUNDS 
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ABSTRACT 

When  there  is  extreme censoring  on  the  right,  the  Kaplan-Meier 
product  limit  estimator  is  known  to  be  a  biased  estimator  of  the 
survival  function.  Several  modifications  of  the  Kaplan-Meier 
estimator  are  examined  and  compared  with  respect  to  bias  and  mean 
squared  error. 


Key  words:  Adjusted  Kaplan-Meier  survival  estimation. 
Bias  of  survival  function,  Life-testing, 
Survival  analysis,  Right  censoring 


1.  Introduction 


In  human  and  animal  survival  studies,  as  well  as  in 
life-testing  experiments  in  the  physical  sciences,  one 
method  of  estimating  the  underlying  survival  distribution 
(or  the  reliability  of  a  piece  of  equipment)  which  has 
received  widespread  attention  has  been  the  Kaplan-Meier 
product  limit  estimator  (Kaplan  and  Meier,  1958) . 

For  the  situation  in  which  the  longest  time  an  individual 
is  in  a  study  (or  on  test)  is  not  a  failure  time,  but  rather 
a  censored  observation,  it  is  well  known  that  there  are  many 
complex  problems  associated  with  any  statistical  analysis 
(Lagakos,  1979).  In  particular,  the  Kaplan-Meier  product 
limit  estimator  is  biased  on  the  low  side  (Gross  and  Clark* 

1975) .  In  the  case  of  many  censored  observations  larger 
than  the  largest  observed  failure  time  this  bias  tends  to  be 
worse.  Estimated  mean  survival  time  and  selected  percentiles, 
as  well  as  other  quantities  dependent  on  knowledge  of  the 
tail  of  the  survival  function,  will  also  exhibit  such  biases. 

A  practical  situation  which  motivates  this  study  is  a 
large-scale  animal  experiment  conducted  at  the  National 
Center  for  Toxicological  Research  (NCTR)  where  mice  were  fed 
a  particular  dose  of  a  carcinogen.  The  goals  of  this  experiment 
were  to  assess  the  effects  of  the  carcinogen  on  survival  and 
on  age-specific  tumor  incidence.  Towards  this  end,  mice 
were  randomly  divided  into  three  groups  and  followed  until 
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death  or  until  a  prespecified  group  censoring  time  (280,  420, 
or  560  days)  was  reached  at  which  time  all  those  still  alive 
in  a  given  group  were  sacrificed.  Often  there  were  many 
surviving  mice  in  all  three  groups  at  the  group's  sacrifice 
time. 

In  general,  we  consider  an  experiment  with  n  individuals 

under  study  and  censoring  is  permitted.  Let  t^j,...,  t^j 

denote  the  m  ordered  failure  times  of  those  m  individuals 

whose  failure  time  is  actually  observed  (t^  <  tj2>  <  ...  < 

t  ^ ) .  The  remaining  n-m  individuals  have  been  censored  at 

various  points  in  time.  It  will  be  useful  to  introduce  the 

notation  S .  =  number  of  survivors  just  prior  to  time  t, . , 

3  (D )  / 

i.e.,  S.  is  the  number  of  individuals  still  under  observation 
3 

at  the  time  t  ^ j  ^ ,  including  the  one  that  died  at  t  ^ ^  ^ .  Then 
the  Kaplan-Meier  product  limit  estimator  (assuming  no  ties 
among  the  t^'s)  of  tlie  underlying  survival  function, 

P(t)  *  P  (T>t),is 

r  l 

p(t)  =J  j 

)  n  (s.  -i)/s 
I  i=i  1 


o 


for 


t-t (m+1) 


3 


for  j=l,...,m,  where  t(m+u=tc  ^  the  longest  time  an 
individual  is  on  study  is  a  censoring  time  or  t (m+i)=  ®  if 
the  longest  time  an  individual  is  on  study  is  a  death. 

This  paper,  first,  proposes  in  Section  2  some  methods  of 
"completing"  the  Kaplan-Meier  estimator  of  the  survival 
function  by  i)  replacing  those  censored  observations  that 
are  larger  than  the  last  observed  failure  time  by  their 
expected  order  statistics,  ii)  using  a  Weibull  distribution 
to  estimate  the  tail  probability,  P(t),  t>tc»  and  iii) 
employing  a  method  suggested  by  Brown,  Hollander  and  Korwar 
(BHK)  (1974).  The  second  purpose  is  to  demonstrate  the 
magnitude  of  the  bias  and  mean  squared  error  (MSE)  of  the 
Kaplan-Meier  estimator  and  to  compare  all  methods  of 
"completing"  F(t),  in  the  context  of  the  aforementioned 
mouse  study,  utilizing  simulated  lifetimes  from  exponential, 
Weibull,  lognormal,  and  bathtub- shaped  hazard  function 
distributions.  These  results  are  presented  in  Section  3. 

2.  Completion  of  Kaplan-Meier  Product-Limit  Estimator 
2.1  Expected  Order  Statistics 

One  method  of  attempting  to  "complete"  p  (t) ,t>t  ,  would 
be  to  "estimate"  the  failure  times  for  those  censored  obser¬ 
vations  which  are  larger  than  the  longest  observed  lifetime. 
Let  nc  be  the  number  of  censored  observations  larger  than 
tjmj.  A  theorem  regarding  the  conditional  distributions  of 
order  statistics  states  that  for  a  random  sample  of  size  n 
from  a  continuous  parent,  the  conditional  distribution  of 
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T(u),  given  T(n_n  ,  -  t,  ,,  u>n-n0>  is  just  the 

c  c 

distribution  of  the  (u-n  +  n  ) th  order  statistic  in  a  sample 

C 

of  size  nQ  drawn  from  the  parent  distribution  truncated  on 

the  left  at  t  *  t,  .  (David  1981,  p.  20). 

(n-nc) 

For  computational  purposes ,  take  t  as  am  estimate  of 

w 

the  (n-n  )th  order  statistic.  Then  find  the  expected  value 

G 

of  the  nc  order  statistics  from  the  parent  distribution 

truncated  on  the  left  at  t  .  Since  the  Weibull  distribution 

c 

_  v 

with  survival  function  P(t)  *  exp(-t  /0)  has  been  widely 
accepted  as  providing  a  reasonable  fit  for  lifetime  data, 
it  seems  reasonable  to  employ  the  results  of  Weibull  distri¬ 
bution  theory  to  complete  P  (t)  ,  t>tc-  (It  should  be  noted 
that  any  distribution  which  is  reasonable  for  the  specific 
situation  may  be  used.)  The  expected  values  of  Weibull 
order  statistics  up  to  sample  size  40  for  location  parameter 
=  1  and  shape  parameter  =  0.5  (0.5)  4  (1)8  may  be  found  in 
Harter  (1969) .  For  larger  sample  sizes  he  states  a  re¬ 
currence  relation  which  may  be  used. 

To  compute  expected  values  of  the  nc  order  statistics  in 
question,  values  for  k  and  0  must  be  chosen.  One  approach  is 
to  use  the  maximum  likelihood  estimators,  k  and  0, computed 
by  using  all  observations  to'  estimate  k  and  0.  A  second 
approach,  due  to  White  (1969)  ,  uses  least  squares  estimates 
of  k  and  0  obtained  by  fitting  the  model 


In  (t(j))=(l/k)ln0+(l/k)ln  H(t{j)) 


(2) 
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to  the  tjjj's  where  H(t^)  is  the  estimated  cumulative 
hazard  rate  at  t^  obtained  from  the  Kaplan-Meier  estimator. 
Based  on  our  Monte  Carlo  study  we  found  the  maximum  likeli¬ 
hood  estimators  performed  better  in  all  cases  than  did  the 
least  squares  estimators.  Consequently,  the  method  of  least 
squares  will  be  dropped  from  future  discussion  in  this  paper. 
The  survival  function  for  a  Weibull  random  variable, 

truncated  on  the  left  at  t  ,  is 

c 

PT(t)  =  exp  {-(tk  -  tk  )/0>  ,  t>tc.  (3) 

So,  by  the  theorem  on  order  statistics  stated  at  the  be¬ 
ginning  of  this  section,  the  conditional  distribution  of 

T(„  ,  given  T  ,-t(  ,  (a  -n-n=  *1 . n)  will  be 

c  c 

approximated  by  the  (u-n  +  n  )th  order  statistic  in  a  sample 

C 

of  n,  drawn  from  (3).  For  simplicity,  let  j  =  u  -  n  +  n  , 
c  c 

so  that  j  =  l,...,nc. 

Now  the  expected  value  of  the  jth  order  statistic  from  (3)  is 
E(Tj:n)  =  nc  f  °  V"  t{PT(t)J  j_1{PT(t)  }nc  "j+1(ktk"1/9)dt 

=nc|  C  j  /”  (yk+tk  )  1/k  (P(y) }j"1{P(y) }nc"j+1(kyk"1/Q)dy  (4) 
where  P  (y)  =  exp  (-yk/8)  ,y=  (tk-tk)  1</k^0  . 

and  T.  is  the  jth  order  statistic  in  a  sample  of  size  n  . 


Equation  (4)  can  also  be  written  as 


E(T.  )=n 


nc_1 


j-llrJJ  (0zk+tk)  1/k{B(z)  }j_1{P(z)  }nc3+1kzk"1dz  (5) 


where  p  (z)  =  exp  (-zk)  ,  z  =  (y/0)^%  0. 

Now  E (T .  )  may  be  crudely  estimated  by 

3  :nc 

{9  {  E  (Z.  )}k+  tk}  1/k  (6) 

c 

where  E(Z.  )  is  the  expected  value  of  the  jth  order 
3  :nc 

statistic  from  a  sample  of  size  nc  determined  from  Harter’s 

A  A 

(1969)  tables  or  recurrence  relation  and  0  and  k  are 

maximum  likelihood  estimators  of  0  and  k  respectively. 

These  nc  estimated  expected  order  statistics  may  then  be 

treated  as  "observed"  lifetimes  in  adjusting  (or  "completing") 

the  estimated  survival  function  computed  in  (1).  The  area 

under  the  estimated  survival  function  up  to  t  remains  un- 

c 

changed.  The  area  under  the  extended  estimated  survival 
function  based  on  the  nc  estimated  expected  order  statistics 
is  then  added  to  the  initial  area  to  get  a  more  precise 
estimate  of  P(t)  (EOS  extension). 

2.2  Weibull  Maximum  Likelihood  Technioues 


A  straightforward  approach  to  completing  P(t)  is  to  set 


P(t)  =  exp  (-r/G) 


for  t>t 


Estimates  of  k  and  0  based  on  all  observations  can  be  obtained 
by  either  the  maximum  likelihood  (WTAIL)  or  the  least  squares 


method.  However ,  our  study  found  the  completion  using 
maximum  likelihood  estimators  was  always  better  in  terms  of 
bias  and  mean  squared  error. 


One  ostensible  suggestion  for  improvement  of  this 
estimator  would  be  to  "tie"  the  estimated  tail  to  the 
produc/t-limit  estimator  at  t  .  Two  methods  were  attempted 

w 

to  accomplish  this  goal.  First,  the  likelihood  was  maximized 

with  respect  to  k  and  0  subject  to  the  constraint  that 
k 

exp  (-t  /0)=P (t  ) .iThis  method  will  be  referred  to  as  the 

v  C 

restricted  MLE  tail  probability  estimate  (RWTAIL  extension) . 
Second,  a  scale-shift  was  performed  on  the  tail  probability 
in  (7)  so  as  to  tie  it  to  the  product-limit  estimator.  This 
method  led  to  higher  biases  and  mean  squared  errors  of  the  - 
survival  function  and  will  be  dropped  from  further  discussion 
in  this  paper. 

2.3  BHK  Type  Methods . 

The  Brown-Hollander-Korwar  completion  of  the  product-limit 
estimator  sets 

P(t)  =  exp  {-t/0*}  for  t>t  (8) 

C 

* 

where  0  satisfies 

P(tc)  =  exp  (-tc/0*). 

In  the  BHK  spirit  we  tried  to  complete  P(t)  by  a  Weibull 
function  which  used  estimates  of  k  and  e,k*  and  0*,  that 
satisfied  the  following  two  relations 


( 


.ft, .  j! 


*  >  ^  *  » 

UaJUJUla  * - 


.>  .V 


» 
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and 

5  (t(m-l)>  =  exP 

The  latter  method  also  led  to  consistently  poor  performance 
and  the  results  will  not  be  presented. 

3.  A  Comparison  of  the  Various  Methods 

A  simulation  study  of  data  like  that  collected  at  NCTR 
was  performed.  Three  groups  of  48  lifetimes  were  simulated 
with  all  testing  stopping  at  280,  420,  and  560  days  for  the 
three  groups,  respectively.  Distributions  with  mean  survival 
times  of  400,  500,  and  600  days  were  used.  The  generated 
lifetimes  greater  than  or  equal  to  the  sacrifice  time  for 
that  particular  group  were  considered  as  censored.  The 
remaining  set  of  observed  lifetimes,  along  with  the  number 
censored  at  the  three  sacrifice  times  constituted  a  single 
sample.  For  each  of  the  distributions  studied,  1000  such 
samples  were  generated.  Weibull  distributions  with  shape 
parameters  .5,  decreasing  failure  rate  ,  1,  constant  failure 
rate,  and  4,  increasing  failure  rate,  were  used.  Lognormal 
distributions,  failure  rate  changes  from  increasing  to 
decreasing,  with  first  two  moments  comparable  to  the  above 
Weibull  distributions  with  k=l  and  k=4  were  also  used. 
Finally,  a  bathtub  hazard  model,  of  Glaser  (1980),  failure 
rate  changes  from  decreasing  to  increasing,  was  used.  This 
distribution  is  a  mixture  of  an  exponential  of  parameter  \ 
with  probability  1-p  and  a  gamma  with  parameter  X  and  index 
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of  3  with  probability  p.  Mixing  parameters  of  p=.l  and  p=.4 
were  used. 

The  bias  and  MSE  for  the  estimation  of  the  tail  probabil¬ 
ities/  i.e.,  the  completed  portion  of  the  product-limit 
estimator,  were  calculated  for  each  hypothesized  distribution 
and  for  each  competing  method  of  completion.  Since  these 
results  were  extremely  similar  to  those  found  in  estimating 

mean  survival  time,  y=/**P(t)  dt,  we  only  show  the  bias  and 

o 

MSE  of  each  competing  estimator  of  y  in  Table  1.  This  also 
allows  us  to  demonstrate  the  magnitude  of  the  bias  and  MSE 
of  the  product-limit  estimator  of  ii.  The  bias  and  MSE  for 
estimating  the  90th  percentile  are  also  presented  for  the 
various  estimation  methods  in  Table  2.  As  one  would  expect, 
the  Kaplan-Meier  (K-M)  estimator  performs  considerably  more 
poorly  than  the  other  estimation  schemes.  The  BHK  extension 
does  very  well  if  the  underlying  distribution  is  exponential 
or  lognormal  with  first  two  moments  compatible  with  the 
exponential.  BHK  does  reasonably  well  for  the  bathtub  shaped 
hazard  model  but  i t  performs  very  poorly  for  the  Weibull  with 
increasing  failure  rate  and  for  the  lognormal  with  first  two 
moments  compatible  with  the  Weibull. 

The  remaining  three  extensions  (EOS,  WTAIL,  and  RWTA1L) 
appear  to  be  somewhat  comparable.  Each  of  them  are  best  under 
certain  circumstances  although  many  times  the  biases  and  MSE's 
are  so  close  to  one  another  that  they  are  essentially  equiva- 


lent.  Only  the  EOS  extension  has  the  desirable  property  of 
never  being  worst.  It  usually  is  competitive  with  the  method 
that  is  best. 

Ordering  the  extensions  from  the  standpoint  of  simplicity 
simplest  to  most  complex,  we  have  BHK,  WTAIL,  RWTAIL,  and  EOS 
In  summary ,  the  Kaplan-Meier  estimator  should  probably  be 
extended  in  the  presence  of  extreme  right  censoring.  One's 
choice  of  extension  depends  upon  one's  knowledge  of  the 
distribution  of  lifetimes  under  consideration  and  the  extent 
of  computer  facilities  available.  If  the  data  follow  an  ex¬ 
ponential  type  distribution  or  if  no  computer  facilities  are 
present  the  BHK  method  is  the  extension  of  choice  due  to  its 
simplicity.  If  the  data  exhibit  a  non-constant  failure  rate 
and  computer  facilities  are  available  then  the  RWTAIL  or  EOS 
extensions  seem  to  be  advisable  to  use. 
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Bias/ 100  (*nd  MSE/(100)2).  for  Estimating  Moan  Survival  Time 
for  Various  Maehods  of  Completion 


Distributions 
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TABLE  2 


Bias/100 (and  MSE/1100)2)  for  Estimating- 90th  Percentile 
for  Various  Methods  of  Completion 
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exponential  distributions  of  Block  and  Basu  (1974)  and  Marshall 
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1.  INTRODUCTION 

Let  X,  Y  have  either  the  bivariate  exponential  distribution 
(BVE)  of  Marshall  and  Olkin  (1967)  or  the  absolutely  continuous 
bivariate  exponential  distribution  (ACBVE)  of  Block  and  Basu 
(1974).  These  two  distributions  have  found  considerable  use  as 
models  for  both  physical  and  biological  systems.  The  problem 
of  interest  is  to  estimate  the  joint  reliability  function, 

F(x,y)  =  P(X>x,  Y>y) ,  for  each  of  these  two  distributions.  A 
natural  estimator  of  F(x,y)  is  obtained  by  substituting  in  the 
appropriate  expression  for  F(x.y)  good  estimators  of  the  model 
parameters.  Often,  as  seen  in  Pugh  (1963),  Basu  (1964)  or  Basu 
and  El  Mawaziny  (1978) ,  such  estimators  can  be  considerably 
biased.  We  wish  to  obtain  reduced  biased  estimators  of  F(x,y) 
for  both  the  BVE  and  ACBVE  distributions. 

In  Section  2  this  estimation  problem  is  considered  for  the 
ACBVE.  In  the  case  of  identically  distributed  marginals,  using 
the  Roa-Blackwell  and  the  Lehmann-Schef f e  theorems  we  obtain 
minimum  variance  unbiased  estimators  (UMVUE)  of  F(x,y).  In 
the  case  of  non-identically  distributed  marginals  this  approach 
fails  since  there  are  no  complete  sufficient  statistics.  Here 
we  investigate  the  performance  of  the  maximum  likelihood 


estimator  as  well  as  the  jacknifed  maximum  likelihood  estimator. 

In  Section  3  we  consider  the  estimation  of  F(x,y)  for  the 
BVE.  Again  there  are  no  complete  sufficient  statistics  so  no 
minimum  variance  unbiased  estimators  can  be  obtained.  Several 
different  methods  for  estimating  parameters  are  considered.  For 
each  estimation  procedure  we  consider  several  bias  reduction 
techniques . 

2.  ABSOLUTELY  CONTINUOUS  BIVARIATE  EXPONENTIAL 

2.1  Introduction 

Let  (X,  Y)  have  the  absolutely  continuous  bivariate 
exponential  distribution  of  Block  and  Basu  (1974)  with  parameters 

*2  >  °’  *12  -°  ^  ACBVE  *2’  *12))‘  This 
distribution  is  closely  related  to  the  bivariate  exponential  of 

Freund  (1961).  It  has  been  used  by  Gross,  Clark  and  Lui  (1971) 

and  Gross  (1973)  to  model  the  lifetimes  of  two  organ  systems 

and  by  Gross  and  Lam  (1981)  for  modeling  paired  survival  time 

data  such  as  survival  of  a  tumor  remission  when  a  patient 

receives  two  types  of  treatment. 

For  this  model  the  joint  reliability  function  is 
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F(x,y)  =  (-x~  yx~y  exp(-AlX-A2y-A12  max(x,y)) 


t— r  exp(-A  max(x.y)),  for  x,  y  >  0, 

^+a2) 


with  A  =  A^  +  A2  +  A^. 


(2.1.1) 


This  distribution  has  the  bivariate  loss  of  memory  property 
(LMP)  defined  by  Block  and  Basu  (1974) .  It  is  the  absolutely 
continuous  part  of  the  Marshall  and  Olkin  (1967)  bivariate 
exponential. 

We  shall  consider  two  cases  for  estimating  F(x,y),  one 
where  the  marginals  are  identically  distributed  and  the  general 
model. 


2.2  Equal  Marginals 

Consider  the  model  (2.1.1)  with  A^  =  A2  =  a  and  A^2  =  B. 
Let  (X;L>  y^) ,  ....  (xn,  yn)  be  a  random  sample  from  (2.1.1). 
Let  =  Emax(x^,  y^)  and  U2  =  E(x^  +  y^).  Mehrotra  and 
Michalek  (1976)  show  that  (U^,  U2)  is  a  complete  sufficient 
statistic  for  (a, 3).  The  MLE  of  a, 3  are  given  by 


u2~u  i  2uru2 


,  8  =  n 


2uru2  u2'ui 


(2.2.1) 
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These  estimators  are  biased  by  a  factor  of  n/(n-l)  so  the 

estimators  &  =  a  and  8  =  8  are  the  UMVUEof  i  and  B. 

n  n 

Two  natural  estimators  of  F(x,y)  are  obtained  by  sustituting 
either  (a,  8)  or  (a,  $)  in  (2.1.1). 

We  now  use  the  method  proposed  by  Basu  (1964)  to  obtain  the 
LTMVUE  of  F(x,y) . 


Define 


4>(x,y :  X,Y)  «  l  1  if  X  >  x,  Y  >  y 


otherwise 


(2.2.  2) 


Clearly  $(x,y:  X,Y)  is  an  unbiased  estimator  of  F(x,y)  based  on 

a  random  sample  of  size  one  from  a  ACBVE  (a, a, 8).  By  the  Rao- 

% 

Blackwell  and  Lehmanii-Scheffe  theorems  the  estimator  F(x,y)  = 
E(<Kx,y;X,Y)|u1,u2)  is  the  UMVUE  of  F(x,y). 

To  simplify  the  calculations  let  T  =  and  v  =  2U1_U2  ’ 

chat  is  T  =  Z  min(X^.Y^)  and  V  =  Zmax(X^,Y^)  -  Z  min(X^,Y^). 

From  Mehrotra  and  Michalek  (1976),  the  joing  density  of  (T,V)  is 

f(c,v)  -  (l2at8)n(a+_8,)n  cn-lvn-lexp(.(2a+g)t.(afB)v)  p  t,  v  >  0 
)  ( (n-1) ! ] 2 


otherwise. 


(2.2.3) 


Now  split  the  sample  of  size  n  into  two  independent  sub¬ 
samples  of  sizes  one  and  n-1,  respectively.  Let  (7.  ^ ,  Z^)  denote 
the  sample  of  size  one  and  let  T^ ,  denote  the  statistics  T 


and  V  defined  on  the  remaining  n-1  observations.  The  joint 
density  of  (Z^,  Z^,  T^,  V^)  is 

f(z  z-.t-.v .)  =  (2ot+e)  (ot+B)  t.n-2v  n_2exp[-(2a+6)t1-(u+B)v1- 

1211  2 1  (n-2) !  ] 2  1 

az1-(a+B)z2]  if  z^  <  z 2 


2a+B)n(a+S)n  _  n-2..  n-2 


2 [ (n-2) !  ] ' 


C1  -  V1  -  exp[  -(2a+B) c 

(a+B)z1~ci22 ]  if  z2  <  Zy  (2.2.4) 


for  t^,  >  0. 


Clearly  V  =  +  max(Z^,  Z2)  -  min(Z^,  Z2)  and  T  =  + 

min(Z^,  Z2).  Hence  the  joint  density  of  (Z^,  Z2 ,  T,  V)  is 

f (z  ,z,,t,v)  = 


(-2«+R  iffltPi-  (c-z  )n~2(v-z  +z  )n_2exp(-(2a+B)t  — (a+6)v)  for 
2 [ (n-1) ! ] 2 

0  <  v,  t;  0  <  z  <  C;  z,  <  z„  <  v  +  z. 


l  n  .  i  n\ n 


2 [ (n-1) ! ] ' 


(t-z2)  (v-z^+z2)exp(-2a+B) t-(a+B)v) 


0  <  v,t;  0  <  z_  <  t  ;  z,  <  z.  <  v  +  z- 


(2.2  5) 


Thus  the  conditional  distribution  of  Z^,  Z2  given  T,  V  is 


f  (z  tz2|T“t,  V=v)  = 


(n-1)2  (t-z1)n"2(v-z2+z,)n~2 
z  tn_1  vn~l 


-  >0  <  z  ^  <  t  , 

<  Z2  <  V  +  z^ 


(n-1)2  (t-z2)n"2(v-z1+z2)n-2 


2  tn_1  vn~l 


-  O  <  z2  <  t  • 

Z2  *  Z1  <  v  +  z2 


(2.26) 


.  V  .%  •.  •-  • 


•*«_-*  .'•‘.'•■o' 


To  find  E(<J>(x,y;  Zj^*  2^IT  =  v  =  v)  consider  three 


dz2dz^ 


(2.2.7) 


Case  1:  t>x=y>0.  Here, 


E(<|)(x,y;Z1,Z2)|T=t,  V=v)  = 


t  v+z.  (n-l)2(t-z.)  (v-z  +z  ) 

2/  /  —  -  n-r-^i— . — 

x  z,  2  t  v 


(i-  f)"-1 


Case  2:  x  <  y  <  t.  Here, 

E(<J)(x,y;Z1,Z2)|T=t,  V=v)  =  /  /  f  (Zj  ,z2 1 1  ,v)dz1dz2  + 

{(zltz2) :z1>y,z2>y} 

+  /  /f »  z2|t,v)dz1dz2  =  (1  -  ^)n_1  + 
[(zltz2):  x<zx<y,  y<z^ 


7f\  <n-l)2(t-zl)rl"2(v+i  -j2)"'2 

+  J  J  i - - .  ■  -  ■  - -  dz  dz1  =  (l-f)n  1  + 

x  y  2  n-1  n-1  21  t 

J  tv 


+  -frfri^1  (-1)k  (  (n!kVir~~~  [(t-x)n+k-1-(t-y)n+k- 
2vn~1tn_1  k=0  (n+k_L) 

(2.2.8) 


Case  3:  t  >  x  >  y  -  0. 


By  symmetry  E($(x,y;  Z^zplT-t,  V=v)  =  (l-*)n_1  + 


(n-1)  y 

0  n-1  n-1  , 

2v  t  k=0 


n-1  ,  ,  .k, 


(n+k-1) 


/  ,  n-l-kf . n+k-1  ,  .n+k-1, 

(v-x+t)  l(c-y)  -(t-x)  ). 

(2.2.9) 


.  <  ..  «  ajf «-*  *J».  ^ %-A .  .%-•  -w'.  «w-  O  .%■  .  ■  ' 


When  (X,Y)  is  ACBVE  (A^.A^,  A^)  with  A^  not  known  to  be  equal  to 
A^,  there  does  not  exist  a  set  of  complete,  sufficient  statistics  for 
(A^.A^.A^).  Hence,  the  technique  described  in  section  2.2  fails. 

Maximum  likelihood  estimators  of  (A^.A^.A^)  are  °^ta^net^  numerically  by 
maximizing  the  likelihood  equation  as  described  in  Block  and  Basu  (1974). 

/v 

The  maximum  likelihood  estimator,  F(x,y)  of  F(x,y)  is  obtained  by 
substituting  these  estimators  into  (2.1.1). 

For  small  sample  sizes,  this  estimator  may  be  highly  biased.  To 
reduce  this  bias  we  consider  the  jackknifed  version  of  the  MLE  estimator. 


This  estimator  is  constructed  as  follows:  Let  F^\.(x,y)  be  the  MLE  of 

(n-l) 

F(x,y)  based  on  the  subsample  of  size  n-l  obtained  by  deleting  the  j th 
observation  from  the  original  sample.  The  jackknifed  version  of  F(x,y)  is 


w*-6’  ■ "  ^•>r)  -  ^  z. 

j=i 


(2.3.1) 


Miller  (1974)  shows  that  this  estimator  removes  the  n  ^th  order  term  in 
the  expansion  of  the  bias  of  F(x,y). 

To  study  the  performance  of  the  MLE  and  the  jackknifed  MLE  of 
F(x,y),  a  simulation  study  was  performed.  For  various  values  of  A^ , A2 » A ^ ^ 
and  n,  500  ACBVE  samples  were  generated  by  the  method  of  Friday  and 
Patil  (1977).  Values  of  (x,v)  were  picked  so  that  F(x,y)  =  .9.  The 
study  showed  that  the  jackknifed  maximum  likelihood  estimator  had 
significantly  smaller  bias  than  the  MLE.  For  sample  sizes  of  10  or 
larger,  the  bias  of  this  estimator  is  not  scat isc icallv  different  from 
zero.  However,  the  jackknifed  MLE  has  a  slightly  larger  mean  squared 


error  than  the  MLE  in  all  cases  considered. 


3.  BIVARIATE  EXPONENTIAL 


3.1  Parameter  Estimation 

We  say  (X,Y)  follows  the  bivariate  exponential  distribution  of 
Marshall  and  Olkin  (1967)  with  parameters  A^  >  0,  >  0,  and 

•\^2  0((X,Y)  is  BVE  (A^,  A^))  if  the  joint  survival  function  is 

P (X  >  x,  Y  >  y)  «  F(x,y)  =  expf-A^x-A^-A^  max(x,y))  (3.1.1) 

for  x  >  y  >  0.  This  distribution  is  not  absolutely  continuous  since 

P(X  =  y)  =  +  ^2  +  ^12^  ‘  The  mar8iflaAs  are  exponential  as  is 

min(X,Y).  This  is  the  only  bivariate  distribution  with  exponential 
marginals  and  the  loss  of  memory  property  (LMP)  as  defined  in  Block  and 
Basu  (1974)  . 

To  estimate  A^ ,  A^,  A^  based  on  a  random  sample  (X^,  Y^),  ...  (X^,  Y^), 
let  n^,  n^ ,  n^2  be  the  number  of  observations  with  Xi  less,  greater,  and 
equal  to  Y^,  respectively.  Let  T  =  EmaxfX^,  Y^,  =  EX^,  “  EY^. 

Bhattacharyya  and  Johnson  (1971)  show  that  (n^,  n^,  t2,  sx,  sy)  are 
jointly  minimal  sufficient  but  not  complete.  Hence,  the  approach  of 
section  2.1  cannot  be  applied.  The  maximum  likelihood  estimators  are 
obtained  by  numerically  maximizing  the  likelihood  equations.  Bhattacharyya 
and  Johnson  (1971)  obtain  conditions  under  which  the  MLE  exist,  and  show 
that  these  estimators  are  asymptotically  trivariate  normal  with  mean 

(V  '2’  X12K 

Bemis,  Bain,  and  Higgins  (1972)  have  obtained  method  of  moments 
estimators  of  the  parameters.  Proschan  and  Sullo  (1976)  obtained 
estimators  of  the  parameters  by  using  a  first  iterate  in  the  likelihood 
equations.  Arnold  (1968)  gives  estimators  of  based  on  n^,  n2>  n^  ,  and 
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U  =  T.  min  (X^,  Y^).  In  Che  compeCing  risks  framework  where  only  the 
minimum  of  X  and  Y  is  observed,  these  estimators  are  the  unique  minimum 
variance  unbiased  estimators  of  A^.  All  of  the  above  estimators  are 
asymptotically  crivariate  normal  with  mean  (A^,  X2,  X^2). 

3.2  Estimation  of  Tail  Probability 

The  problem  of  interest  is  to  estimate  F(x,y)  given  by  (3.1.1). 

A  natural  method  of  estimating  (3.1.1)  is  to  use  one  of  the  above  methods 
to  estimate  (A  ,  X2>  Al2)  and  substitute  these  estimates  in  (3.1.1). 

Several  methods  may  be  uwed  to  reduce  the  bias  of  these  estimators. 

The  first  approach  is  to  expand  the  substitution  estimator  in  a  Taylor 

series  about  (A^ ,  X2,  A^)  keeping  only  second  order  terms.  When 

>\ 

E(X  )  =  A  ,  che  bias  of  Che  substitution  estimator  is  approximately  equal  to 
1  1 


E(F  (x,y))  =  F(x,y)  [1  +  c2/2]  where 


O2  =  (x,y,  max(x,y))I(x,y,  raax(x,y))’ 


(3.2.1) 


and  E  is  the  appropriate  covariance  matrix  of  ( X^ ,  X^ ,  X^). 
suggests  a  reduced  bias  estimator  of  F(x,y)  given  by 


FTs(x,y)  =  FSUB(x’y)/[1  +  a  /2] 


(3.2.2) 


a  2 

where  a2  is  an  estimator  of  0  . 


A  second  approach  to  the  bias  of  ^  through 

A 

asymptotic  theory.  Note  that  In  FSUg(x,y)  is  asymptotically 

2 

normal  with  mean  -X ^x  -X2y  -X ^2  max(x,y)  and  variance  ®  •  Thus, 
for  large  n,  FSUB(x,y)  has  a  log  normal  distribution  and 


■*  «*•  /*-  "  -  • 


’  %  * 


E(?SUB(x,y))  E(x*y)  exP(°2/2>  (3.2.3) 

a  *  2  2 

and  V(FSUB(x,y))  F(x,y))2e  (e  -1).  This  suggests  a  reduced 
bias  estimator  of  F(x,y)  given  by 

~  ~  ^ 

^LN^X,y^  =  ysUB^x,y^exp^2//2^  ‘  (3.2.4) 


A  third  method  to  reduce  the  bias  of  FPTI_(x,y)  is  the  jackknife 

^  bUb 

as  described  in  Section  2.3. 

To  compare  these  estimators,  a  simulation  study  was  performed. 

500  BVE  observations  were  generated  for  various  combinations  of 
n’  *2’  *12”  Values  <x«y)  were  selected  so  that  F(x,y)  =  .9. 

Several  conclusions  can  be  drawn  from  the  study.  First,  for  all 
bias  reduction  techniques,  those  based  on  Arnold’s  estimators  have  a 
significantly  larger  mean  squared  error  but  a  smaller  relative  bias. 
Secondly,  there  appears  to  be  very  little  difference  in  the  estimators 
based  on  the  other  three  methods .  For  Arnold's  estimators,  all  three 
bias  reduction  techniques  yield  approximately  unbiased  estimators  with 
comparable  mean  squared  errors.  For  the  other  methods,  onry  the 
jackknifed  estimator  is  approximately  unbiased  due  to  bias  of  the 
parameters  themselves.  Our  recommendation  is  to  jackknife  either  the 


Proschan  and  Sullo  estimator  or  the  method  of  moments  estimator 
since  these  are  computationally  easier  than  the  MI.E. 


REFERENCES 


Arnold,  B.  C.  (1968).  "Parameter  estimation  for  a  multivariate 

exponential  distribution,"  Journal  of  the  American  Statistical 
Association,  63,  848-52. 

Basu,  A.  P.  (1964).  "Estimates  of  reliability  for  some  distributions 
useful  in  life-testing,"  Technometrics ,  6,  215-219. 

Basu,  A.  P.  and  El  Mavaziny  (1978).  "Estimates  of  reliability  of 
k-out-of-m  structures  in  the  independent  exponential  case,” 

Journal  of  the  American  Statistical  Association,  63,  850-854. 

Bemis,  B.  M.,  Bain,  L.  J.,  and  Higgins,  J.  J.  (1972).  "Estimation  and 
hypothesis  testing  for  the  parameters  of  a  bivariate  exponential 
distribution,"  Journal  of  the  American  Statistical  Association,  67, 
927-929. 

Bhattacharyya,  G.  K.,  and  Johnson,  R.  A.  (1971).  "Maximum  likelihood 
estimation  and  hypothesis  testing  in  the  bivariate  exponential 
model  of  Marshall  and  Olkin,  Technical  Report  No.  276, 

Department  of  Statistics,  University  of  Wisconsin. 

Block,  H.  W. ,  AND  Basu,  A.  P.  (1974).  "A  continuous  bivariate  exponential 
extension,"  Journal  of  the  American  Statistical  Association,  69, 
1031-1037. 

Freund,  J.  E.  (1961).  "A  bivariate  extension  of  the  exponential  distri¬ 


bution,"  Journal  of  the  American  Statistical  Association,  56,  971-977. 


Friday,  D.  S.,  and  Patil,  G.  P.  (1977).  "A  bivariate  exponential  model 
with  applications  to  reliability  and  computer  generation  of  random 
variables,"  in  The  Theory  and  Application  of  Reliability,  eds., 

C.  P.  Tsokos  and  I.  N.  Shimi,  New  York,  Academic  Press,  Inc., 

527-549. 

Gross,  A.  J.  (1973),  "A  competing  risk  model:  a  one  organ  subsystem  plus 
a  two  organ  subsystem,"  IEEE  Transactions  on  Reliability,  R-22,  24-27. 

Gross,  A.  J.,  Clark,  V.  A.,  and  Liu,  V.  (1971).  "Estimation  of  survival 
parameters  when  one  of  two  organs  must  function  for  survival," 
Biometrics ,  27,  369-377. 

Marshall,  A.  W.,  and  Olkin,  I.  (1967).  "A  multivariate  exponential 

distribution,"  Journal  of  the  American  Statistical  Association,  62, 
30-40. 

Mehrotra,  K.  G.  and  Michalek,  J.  E.  (1976).  "Estimation  of  parameters 
and  tests  of  independence  in  a  continuous  bivariate  exponential 
distribution,"  unpublished. 


